research-article
Open access

Input splitting for cloud-based static application security testing platforms

Published: 09 November 2022

Abstract

As software development teams adopt DevSecOps practices, application security is increasingly the responsibility of development teams, who are required to set up their own Static Application Security Testing (SAST) infrastructure.
Since development teams often lack the infrastructure and expertise to set up a custom SAST solution, there is an increased need for cloud-based SAST platforms that operate as a service and run a variety of static analyzers. Adding a new static analyzer to a cloud-based SAST platform can be challenging because static analyzers vary greatly in complexity, from linters that scale efficiently to interprocedural dataflow engines that use cubic or even more complex algorithms. Careful manual evaluation is needed to decide whether a new analyzer would slow down the overall response time of the platform or time out too often.
We explore the question of whether this can be simplified by splitting the input to the analyzer into partitions and analyzing the partitions independently. Depending on the complexity of the static analyzer, the partition size can be adjusted to curtail the overall response time. We report on an experiment where we run different analysis tools with and without splitting the inputs. The experimental results show that simple splitting strategies can effectively reduce the running time and memory usage per partition without significantly affecting the findings produced by the tool.
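The idea of splitting an analyzer's input into size-bounded partitions and analyzing each one independently can be illustrated with a minimal sketch. The abstract does not say which splitting strategies the paper evaluates, so the greedy grouping below is an assumption chosen for illustration. The names `split_input`, `analyze_in_partitions`, `max_partition_size`, and `run_analyzer` are hypothetical and not taken from the paper.

```python
from typing import Callable, Dict, Iterable, List


def split_input(file_sizes: Dict[str, int], max_partition_size: int) -> List[List[str]]:
    """Greedily group files into partitions whose total size stays within a cap.

    Assumption: files are visited in sorted path order, so files from the
    same directory tend to land in the same partition.
    """
    if max_partition_size <= 0:
        raise ValueError("max_partition_size must be positive")

    partitions: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for path in sorted(file_sizes):
        size = file_sizes[path]
        # Close the current partition if adding this file would exceed the cap.
        if current and current_size + size > max_partition_size:
            partitions.append(current)
            current, current_size = [], 0
        # A file larger than the cap still gets a partition of its own.
        current.append(path)
        current_size += size
    if current:
        partitions.append(current)
    return partitions


def analyze_in_partitions(
    file_sizes: Dict[str, int],
    max_partition_size: int,
    run_analyzer: Callable[[List[str]], Iterable[str]],
) -> List[str]:
    """Run the (hypothetical) analyzer once per partition and merge the findings."""
    findings = set()
    for partition in split_input(file_sizes, max_partition_size):
        findings.update(run_analyzer(partition))
    return sorted(findings)


if __name__ == "__main__":
    sizes = {"a/x.py": 40, "a/y.py": 50, "b/z.py": 30, "c/big.py": 200}
    for part in split_input(sizes, max_partition_size=100):
        print(part)
```

Lowering `max_partition_size` gives a more expensive analyzer less work per run, at the risk of separating files that an interprocedural analysis would need to see together. That is the trade-off between response time and findings that the experiment in the abstract measures.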


    Published In

    ESEC/FSE 2022: Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
    November 2022
    1822 pages
    ISBN: 9781450394130
    DOI: 10.1145/3540250
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. API usage checking
    2. software security
    3. static analysis in the cloud

    Qualifiers

    • Research-article

    Conference

    ESEC/FSE '22

    Acceptance Rates

    Overall Acceptance Rate 112 of 543 submissions, 21%
