Abstract
Current approaches to automated vulnerability discovery aim to increase coverage but typically do not interact with the web application. As a result, vulnerabilities in code that handles user interactions often remain undiscovered. This paper evaluates interactive strategies that simulate user interaction to increase client-side JavaScript code coverage. As a case study, we analyze 5 widely deployed, real-world web applications and find that simple random walks can double the number of covered branches compared to merely waiting for the page to load (“load-and-wait”). Additionally, we propose novel approaches relying on state-independent models and demonstrate that these outperform the non-interactive baseline by \(2.4\times\) in terms of covered branches and \(3.1\times\) in terms of discovered data flows. Our interactive strategies revealed a client-side data flow in SuiteCRM that is exploitable as a stored XSS and SSRF attack but cannot be found without user interaction.
Appendices
A Heuristic Candidate Weights
In this section, we describe how we chose the parameterized weight values for the random-walk+heuristics strategy (see Sect. 4.2) based on experiments with the code-server application. We vary each parameter individually while fixing the remaining parameters at 1.0, which effectively disables all but one heuristic. For each configuration of the weight parameters, we perform 6 runs of 10 min each; the results are shown in Fig. 8. For comparison, the dashed line shows the mean number of branches covered by the random-walk strategy as a baseline.
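The one-parameter-at-a-time setup described above can be sketched as follows. This is only an illustration: the parameter names mirror the weight symbols used in this appendix, the run count and duration come from the text, and the grids in `CANDIDATE_VALUES` are hypothetical placeholders, since the tested values are not listed here.

```python
# Heuristic names follow the weight symbols used in this appendix.
PARAMETERS = ("unexplored", "coverage", "errored", "pageload")
NEUTRAL = 1.0          # a weight of 1.0 effectively disables a heuristic
RUNS_PER_CONFIG = 6    # taken from the text
MINUTES_PER_RUN = 10   # taken from the text

# Hypothetical grids: the tested values are an assumption of this sketch.
CANDIDATE_VALUES = {
    "unexplored": (5.0, 10.0, 15.0, 20.0),
    "coverage": (2.0, 4.0),
    "errored": (0.1, 0.5),
    "pageload": (0.1, 0.5),
}


def one_at_a_time_configs(candidate_values):
    """Yield weight configurations that vary exactly one parameter.

    All other parameters stay at the neutral value, so only a single
    heuristic is active in each configuration.
    """
    for name, values in candidate_values.items():
        for value in values:
            config = dict.fromkeys(PARAMETERS, NEUTRAL)
            config[name] = value
            yield config


def total_minutes(candidate_values):
    """Total analysis time needed to run every configuration."""
    n_configs = sum(len(values) for values in candidate_values.values())
    return n_configs * RUNS_PER_CONFIG * MINUTES_PER_RUN


if __name__ == "__main__":
    for config in one_at_a_time_configs(CANDIDATE_VALUES):
        print(config)
    print(total_minutes(CANDIDATE_VALUES), "minutes of analysis in total")
```

Varying one weight at a time keeps the number of runs linear in the number of tested values, at the cost of not observing interactions between heuristics.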
The parameter with the most significant effect on coverage is \(w^\textrm{unexplored}\). For \(w^\textrm{unexplored} = 15\), the mean number of covered branches improves by 9.6% compared to the baseline. For values below and above 15, the gain over the random-walk strategy in covered branches and in the number of unique candidates is smaller. We therefore set \(w^\textrm{unexplored} := 15\) for our experimental evaluation. For the parameter \(w^\textrm{coverage}\), we choose \(w^\textrm{coverage} := 2\) for the evaluation, as it provides a small increase (3.9%) compared to the baseline. For the parameters \(w^\textrm{errored}\) and \(w^\textrm{pageload}\), we observe a slight improvement compared to the baseline for all parameter values. However, the significance of these observations is limited because very few actions resulted in an error or a page load: across the 6 baseline runs, only 5.7 actions resulted in an error on average, and only 2 actions led to a page load in total. Since actions resulting in an error or a page load typically do not contribute any new code coverage, we set \(w^\textrm{errored} = w^\textrm{pageload} := 0.1\) in our experiments.
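To illustrate how the chosen weights could bias a random walk, the following sketch draws the next action with probability proportional to a per-candidate weight. It rests on two assumptions that are not confirmed by this appendix: the weights of all applicable heuristics are multiplied (consistent with 1.0 acting as the neutral value), and the boolean flags on the hypothetical `Candidate` class are our reading of the heuristic names.

```python
import random
from dataclasses import dataclass

# Weight values chosen in this appendix.
WEIGHTS = {"unexplored": 15.0, "coverage": 2.0, "errored": 0.1, "pageload": 0.1}


@dataclass
class Candidate:
    """Hypothetical action candidate; the flag semantics are assumptions."""
    name: str
    unexplored: bool = False  # candidate has not been executed yet
    coverage: bool = False    # candidate previously increased coverage
    errored: bool = False     # candidate previously raised an error
    pageload: bool = False    # candidate previously triggered a page load


def candidate_weight(candidate, weights=WEIGHTS):
    """Multiply the weights of all heuristics that apply to the candidate."""
    weight = 1.0
    for flag, factor in weights.items():
        if getattr(candidate, flag):
            weight *= factor
    return weight


def pick_action(candidates, rng):
    """Draw one candidate with probability proportional to its weight."""
    weights = [candidate_weight(c) for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    pool = [
        Candidate("new-button", unexplored=True),
        Candidate("known-link"),
        Candidate("broken-form", errored=True),
    ]
    print(pick_action(pool, rng).name)
```

Under these assumptions, an unexplored candidate is 15 times as likely to be chosen as a neutral one, while a candidate that previously errored is chosen a tenth as often.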
B State Similarity Threshold
To choose an appropriate value for \(\varDelta _\textrm{min}\) (see Sect. 4.3), we performed 3 runs on the code-server application (15 min each) for different possible values. If \(\varSigma _s :=\{ c \in \varSigma \mid \lambda (s, c) > 0 \}\) is the set of action candidates that are available in state s, the mean candidate probability \(\overline{\lambda }(s)\) of state s is defined as follows:
\[ \overline{\lambda }(s) :=\frac{1}{|\varSigma _s|} \sum _{c \in \varSigma _s} \lambda (s, c) \]
Intuitively, \(\overline{\lambda }(s)\) is a measure of the model’s certainty about the action candidates that will be available in state s. If \(\overline{\lambda }(s)\) is high, the model is good at predicting the candidates, while a low \(\overline{\lambda }(s)\) indicates a high uncertainty of the model.
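As a small worked example, the mean candidate probability can be computed by averaging \(\lambda (s, c)\) over the candidates with non-zero probability. This sketch assumes that reading of the definition; the dictionary representation of \(\lambda (s, \cdot )\) and the value returned for a state without candidates are choices made for illustration only.

```python
def mean_candidate_probability(lambda_s):
    """Average of lambda(s, c) over all candidates c with lambda(s, c) > 0.

    `lambda_s` maps each candidate to its probability in one state s.
    """
    available = [p for p in lambda_s.values() if p > 0]
    if not available:
        # Assumption: a state without available candidates gets 0.0.
        return 0.0
    return sum(available) / len(available)


if __name__ == "__main__":
    # Two candidates are available; the third has probability 0 and is ignored.
    state = {"login-button": 1.0, "menu-link": 0.5, "hidden-tab": 0.0}
    print(mean_candidate_probability(state))
```

Candidates with probability 0 do not lower the mean, so the measure reflects only how reliably the available candidates reappear in the state.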
Figure 9 shows the mean candidate probability, the number of visits to each state and the number of states encountered during the analysis. While the mean candidate probability increases with \(\varDelta _\textrm{min}\), a larger \(\varDelta _\textrm{min}\) leads to fewer visits to each individual state and a larger mean number of distinct states per analysis. These results are intuitively expected: the higher \(\varDelta _\textrm{min}\) is chosen, the more likely the model is to create a new state instead of considering a DOM token sequence to belong to a known state. This leads to a larger number of total states, while reducing the number of visits per state. Additionally, a higher similarity between DOM trees in the same state also increases the likelihood that the DOM trees contain the same action candidates, thus leading to a larger mean candidate probability. We therefore set \(\varDelta _\textrm{min} :=0.9\) for the evaluation, which yields a good mean candidate probability of 0.8 and results in 4 visits to each state on average.
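The effect of the threshold can be sketched as a simple state-assignment routine: a DOM token sequence joins the most similar known state if the similarity reaches \(\varDelta _\textrm{min}\), and otherwise opens a new state. The similarity measure is an assumption here, with Python's `difflib.SequenceMatcher` used as a stand-in; representing each state by its first token sequence and picking the best match are likewise illustrative choices, not the method of Sect. 4.3.

```python
from difflib import SequenceMatcher

DELTA_MIN = 0.9  # threshold chosen in this appendix


def similarity(tokens_a, tokens_b):
    """Similarity in [0, 1] between two token sequences (stand-in measure)."""
    return SequenceMatcher(None, tokens_a, tokens_b, autojunk=False).ratio()


def assign_state(tokens, states, delta_min=DELTA_MIN):
    """Return the index of the state that `tokens` belongs to.

    `states` is a list of representative token sequences. If no known state
    is similar enough, `tokens` is appended as the representative of a new
    state and its index is returned.
    """
    best_index, best_score = None, 0.0
    for index, representative in enumerate(states):
        score = similarity(tokens, representative)
        if score > best_score:
            best_index, best_score = index, score
    if best_index is not None and best_score >= delta_min:
        return best_index
    states.append(list(tokens))
    return len(states) - 1


if __name__ == "__main__":
    states = []
    page = ["html", "body"] + ["div"] * 18
    variant = page[:-1] + ["span"]          # one of 20 tokens differs
    other = ["html", "body", "form", "input"]
    print(assign_state(page, states))       # first sequence opens a state
    print(assign_state(variant, states))    # similar enough to join it
    print(assign_state(other, states))      # too different, opens a new state
```

Raising `delta_min` makes the final branch more likely, which matches the observation above that a larger threshold produces more states with fewer visits each.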
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Weidmann, N., Barber, T., Wressnegger, C. (2023). Load-and-Act: Increasing Page Coverage of Web Applications. In: Athanasopoulos, E., Mennink, B. (eds) Information Security. ISC 2023. Lecture Notes in Computer Science, vol 14411. Springer, Cham. https://doi.org/10.1007/978-3-031-49187-0_9
DOI: https://doi.org/10.1007/978-3-031-49187-0_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-49186-3
Online ISBN: 978-3-031-49187-0
eBook Packages: Computer Science, Computer Science (R0)