Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1542476.1542485acmconferencesArticle/Chapter ViewAbstractPublication PagespldiConference Proceedingsconference-collections
research-article

Merlin: specification inference for explicit information flow problems

Published: 15 June 2009 Publication History

Abstract

The last several years have seen a proliferation of static and runtime analysis tools for finding security violations that are caused by explicit information flow in programs. Much of this interest has been caused by the increase in the number of vulnerabilities such as cross-site scripting and SQL injection. In fact, these explicit information flow vulnerabilities commonly found in Web applications now outnumber vulnerabilities such as buffer overruns common in type-unsafe languages such as C and C++. Tools checking for these vulnerabilities require a specification to operate. In most cases the task of providing such a specification is delegated to the user. Moreover, the efficacy of these tools is only as good as the specification. Unfortunately, writing a comprehensive specification presents a major challenge: parts of the specification are easy to miss, leading to missed vulnerabilities; similarly, incorrect specifications may lead to false positives.
This paper proposes Merlin, a new approach for automatically inferring explicit information flow specifications from program code. Such specifications greatly reduce manual labor, and enhance the quality of results, while using tools that check for security violations caused by explicit information flow. Beginning with a data propagation graph, which represents interprocedural flow of information in the program, Merlin aims to automatically infer an information flow specification. Merlin models information flow paths in the propagation graph using probabilistic constraints. A naive modeling requires an exponential number of constraints, one per path in the propagation graph. For scalability, we approximate these path constraints using constraints on chosen triples of nodes, resulting in a cubic number of constraints. We characterize this approximation as a probabilistic abstraction, using the theory of probabilistic refinement developed by McIver and Morgan. We solve the resulting system of probabilistic constraints using factor graphs, which are a well-known structure for performing probabilistic inference.
We experimentally validate the Merlin approach by applying it to 10 large business-critical Web applications that have been analyzed with CAT.NET, a state-of-the-art static analysis tool for .NET. We find a total of 167 new confirmed specifications, which result in a total of 322 additional vulnerabilities across the 10 benchmarks. More accurate specifications also reduce the false positive rate: in our experiments, Merlin-inferred specifications result in 13 false positives being removed; this constitutes a 15% reduction in the CAT.NET false positive rate on these 10 programs. The final false positive rate for CAT.NET after applying Merlin in our experiments drops to under 1%.

References

[1]
D. Chandra and M. Franz. Fine-grained information flow analysis and enforcement in a java virtual machine. In Annual Computer Security Applications Conference, pages 463--475, 2007.
[2]
M. Dalton, H. Kannan, and C. Kozyrakis. Raksha: a flexible information flow architecture for software security. In Proceedings of the International Symposium on Computer Architecture, pages 482--493, 2007.
[3]
D. R. Engler, D. Y. Chen, and A. Chou. Bugs as inconsistent behavior: A general approach to inferring errors in systems code. In In Proceedings of ACM Symposium on Operating Systems Principles, pages 57--72, 2001.
[4]
Fortify. Fortify code analyzer. http://www.ouncelabs.com/, 2008.
[5]
C. L. Goues and W. Weimer. Specification mining with few false positives. In Tools and Algorithms for the Construction and Analysis of Systems, 2009.
[6]
V. Haldar, D. Chandra, and M. Franz. Dynamic taint propagation for Java. In Proceedings of the Annual Computer Security Applications Conference, pages 303--311, Dec. 2005.
[7]
C. A. R. Hoare. An axiomatic basis for computer programming. Communications of the ACM, 12:576--583, October 1969.
[8]
Y.-W. Huang, F. Yu, C. Hang, C.-H. Tsai, D.-T. Lee, and S.-Y. Kuo. Securing Web application code by static analysis and runtime protection. In Proceedings of the Conference on World Wide Web, pages 40--52, May 2004.
[9]
N. Jovanovic, C. Kruegel, and E. Kirda. Pixy: a static analysis tool for detecting Web application vulnerabilities (short paper). In Proceedings of the Symposium on Security and Privacy, May 2006.
[10]
T. Kremenek, P. Twohey, G. Back, A. Y. Ng, and D. R. Engler. From uncertainty to belief: Inferring the specification within. In Symposium on Operating Systems Design and Implementation, pages 161--176, Nov. 2006.
[11]
M. Krohn, A. Yip, M. Brodsky, N. Cliffer, M. F. Kaashoek, E. Kohler, and R. Morris. Information flow control for standard os abstractions. In Proceedings of Symposium on Operating Systems Principles, pages 321--334, 2007.
[12]
F. R. Kschischang, B. J. Frey, and H. A. Loeliger. Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory, 47(2):498--519, 2001.
[13]
Z. Li and Y. Zhou. Pr-miner: Automatically extracting implicit programming rules and detecting violations in large software code. In Proceedings of the European Software Engineering Conference, 2005.
[14]
B. Livshits. Improving Software Security with Precise Static and Runtime Analysis. PhD thesis, Stanford University, Stanford, California, 2006.
[15]
B. Livshits and M. S. Lam. Finding security errors in Java programs with static analysis. In Proceedings of the Usenix Security Symposium, pages 271--286, Aug. 2005.
[16]
B. Livshits and T. Zimmermann. DynaMine: Finding common error patterns by mining software revision histories. In Proceedings of the International Symposium on the Foundations of Software Engineering, pages 296-305, Sept. 2005.
[17]
M. Martin, B. Livshits, and M. S. Lam. Finding application errors and security vulnerabilities using PQL: a program query language. In Proceedings of the Conference on Object-Oriented Programming, Systems, Languages, and Applications, Oct. 2005.
[18]
M. Martin, B. Livshits, and M. S. Lam. SecuriFly: Runtime vulnerability protection for Web applications. Technical report, Stanford University, Oct. 2006.
[19]
A. McIver and C. Morgan. Abstraction, Refinement and Proof of Probabilistic Systems. Springer, 2004.
[20]
A. McIver and C. Morgan. Abstraction and refinement in probabilistic systems. SIGMETRICS Performance Evaluation Review, 32:41--47, March 2005.
[21]
Microsoft Corporation. Microsoft Code Analysis Tool .NET (CAT.NET). http://www.microsoft. com/downloads/details.aspx?FamilyId= 0178e2ef-9da8-445e-9348-c93f24cc9f9d&displaylang=en, 3 2009.
[22]
T. Minka, J.Winn, J. Guiver, and A. Kannan. Infer.NET 2.2, 2009. Microsoft Research Cambridge. http://research.microsoft.com/infernet.
[23]
A. Nguyen-Tuong, S. Guarnieri, D. Greene, J. Shirley, and D. Evans. Automatically hardening Web applications using precise tainting. In Proceedings of the IFIP International Information Security Conference, June 2005.
[24]
OunceLabs, Inc. Ounce. http://www.ouncelabs.com/, 2008.
[25]
T. Pietraszek and C. V. Berghe. Defending against injection attacks through context-sensitive string evaluation. In Proceedings of the Recent Advances in Intrusion Detection, Sept. 2005.
[26]
M. K. Ramanathan, A. Grama, and S. Jagannathan. Static specification inference using predicate mining. In PLDI, 2007.
[27]
A. Sabelfeld and A. Myers. Language-based information-flow security. IEEE Journal on Selected Areas in Communications, 21(1):5--19, January 2003.
[28]
Z. Su and G. Wassermann. The essence of command injection attacks in web applications. In Proceedings of POPL, 2006.
[29]
S. Vandebogart, P. Efstathopoulos, E. Kohler, M. Krohn, C. Frey, D. Ziegler, F. Kaashoek, R. Morris, and D. Mazières. Labels and event processes in the Asbestos operating system. ACM Trans. Comput. Syst., 25(4):11, 2007.
[30]
L. Wall. Perl security. http://search.cpan.org/dist/perl/ pod/perlsec.pod.
[31]
W.Weimer and G. C. Necula. Mining temporal specifications for error detection. In Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 461--476, 2005.
[32]
J. Whaley, M. Martin, and M. Lam. Automatic extraction of objectoriented component interfaces. In Proceedings of the International Symposium on Software Testing and Analysis, pages 218--228, 2002.
[33]
Y. Xie and A. Aiken. Static detection of security vulnerabilities in scripting languages. In Proceedings of the Usenix Security Symposium, pages 271--286, Aug. 2006.
[34]
J. Yang and D. Evans. Perracotta: mining temporal API rules from imperfect traces. In Proceedings of the International Conference on Software Engineering, pages 282--291, 2006.
[35]
J. S. Yedidia, W. T. Freeman, and Y. Weiss. Understanding belief propagation and its generalizations. Exploring Artificial Intelligence in the New Millennium, pages 239--269, 2003.
[36]
N. Zeldovich, S. Boyd-Wickizer, E. Kohler, and D. Mazires. Making information flow explicit in HiStar. In Proceedings of the Symposium on Operating Systems Design and Implementation, pages 263--278, 2006.

Cited By

View all
  • (2024)Extent of spending behavior, problems encountered, and financial knowledge across generational cohorts among state universities and colleges employeesInternational Journal of ADVANCED AND APPLIED SCIENCES10.21833/ijaas.2024.02.02411:2(230-237)Online publication date: Feb-2024
  • (2024)Towards Finding Accounting Errors in Smart ContractsProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639128(1-13)Online publication date: 20-May-2024
  • (2023)Beware of the Unexpected: Bimodal Taint AnalysisProceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3597926.3598050(211-222)Online publication date: 12-Jul-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
PLDI '09: Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2009
492 pages
ISBN:9781605583921
DOI:10.1145/1542476
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 44, Issue 6
    PLDI '09
    June 2009
    478 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/1543135
    Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 June 2009

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. security analysis tools
  2. specification inference

Qualifiers

  • Research-article

Conference

PLDI '09
Sponsor:

Acceptance Rates

Overall Acceptance Rate 406 of 2,067 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)51
  • Downloads (Last 6 weeks)10
Reflects downloads up to 16 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Extent of spending behavior, problems encountered, and financial knowledge across generational cohorts among state universities and colleges employeesInternational Journal of ADVANCED AND APPLIED SCIENCES10.21833/ijaas.2024.02.02411:2(230-237)Online publication date: Feb-2024
  • (2024)Towards Finding Accounting Errors in Smart ContractsProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639128(1-13)Online publication date: 20-May-2024
  • (2023)Beware of the Unexpected: Bimodal Taint AnalysisProceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3597926.3598050(211-222)Online publication date: 12-Jul-2023
  • (2023)Mitigating False Positive Static Analysis Warnings: Progress, Challenges, and OpportunitiesIEEE Transactions on Software Engineering10.1109/TSE.2023.332966749:12(5154-5188)Online publication date: Dec-2023
  • (2022)Adapting Static Taint Analyzers to Software MarketplacesProceedings of the 2022 ACM Workshop on Software Supply Chain Offensive Research and Ecosystem Defenses10.1145/3560835.3564553(73-82)Online publication date: 11-Nov-2022
  • (2022)HiddenCPG: Large-Scale Vulnerable Clone Detection Using Subgraph Isomorphism of Code Property GraphsProceedings of the ACM Web Conference 202210.1145/3485447.3512235(755-766)Online publication date: 25-Apr-2022
  • (2022)A Sanitizer-centric Analysis to Detect Cross-Site Scripting in PHP Programs2022 IEEE 33rd International Symposium on Software Reliability Engineering (ISSRE)10.1109/ISSRE55969.2022.00042(355-365)Online publication date: Oct-2022
  • (2022)InspectJS: Leveraging Code Similarity and User-Feedback for Effective Taint Specification Inference for JavaScript2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)10.1109/ICSE-SEIP55303.2022.9794015(165-174)Online publication date: May-2022
  • (2021)Solver-Aided Constant-Time Hardware VerificationProceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security10.1145/3460120.3484810(429-444)Online publication date: 12-Nov-2021
  • (2020)Probabilistic Naming of Functions in Stripped BinariesProceedings of the 36th Annual Computer Security Applications Conference10.1145/3427228.3427265(373-385)Online publication date: 7-Dec-2020
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media