DOI: 10.1145/3358504.3361228
research-article

Scalable comparison of JavaScript V8 bytecode traces

Published: 22 October 2019

Abstract

The comparison and alignment of runtime traces are essential, e.g., for semantic analysis or debugging. However, naive sequence alignment algorithms cannot address the needs of the modern web for two reasons: (i) the bytecode generation process of V8 is not deterministic; (ii) bytecode traces are large.
We present STRAC, a scalable and extensible tool tailored to compare bytecode traces generated by the V8 JavaScript engine. Given two V8 bytecode traces and a distance function between trace events, STRAC computes and returns the best alignment. The key insight is to split access between memory and disk. STRAC can identify semantically equivalent web pages and can process huge V8 bytecode traces whose order of magnitude matches today's web pages, such as https://2019.splashcon.org, which generates approximately 150k V8 bytecode instructions.
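
To make the idea of aligning two traces under a pluggable distance function concrete, here is a minimal sketch. It assumes a dynamic time warping (DTW) style dynamic program, which the abstract does not confirm as STRAC's algorithm. The function name `align_traces`, the `unit_distance` helper, and the sample opcode strings are hypothetical and chosen only for illustration. The sketch keeps the whole cost matrix in memory, so it does not reproduce the memory/disk split the abstract describes.

```python
from typing import Callable, List, Sequence, Tuple


def align_traces(
    a: Sequence[str],
    b: Sequence[str],
    dist: Callable[[str, str], float],
) -> Tuple[float, List[Tuple[int, int]]]:
    """Align two traces with a DTW-style dynamic program (illustrative only).

    Returns the total alignment cost and the list of matched index pairs.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # advance in both traces
                cost[i - 1][j],      # advance in trace a only
                cost[i][j - 1],      # advance in trace b only
            )

    # Walk back from the end to recover one optimal alignment.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return cost[n][m], path


def unit_distance(x: str, y: str) -> float:
    """Hypothetical distance between trace events: 0 if identical, else 1."""
    return 0.0 if x == y else 1.0


# Two made-up traces that differ only by one repeated instruction.
trace_a = ["LdaZero", "Star r0", "Return"]
trace_b = ["LdaZero", "Star r0", "Star r0", "Return"]
total, pairs = align_traces(trace_a, trace_b, unit_distance)
print(total, pairs)
```

The table in this sketch grows with the product of the two trace lengths, which is one way to see why traces of roughly 150k instructions call for something more careful than a naive in-memory implementation.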


Cited By

  • (2023) Augmenting Diffs With Runtime Information. IEEE Transactions on Software Engineering 49:11 (4988-5007). DOI: 10.1109/TSE.2023.3324258. Online publication date: Nov-2023


Published In

VMIL 2019: Proceedings of the 11th ACM SIGPLAN International Workshop on Virtual Machines and Intermediate Languages
October 2019
66 pages
ISBN:9781450369879
DOI:10.1145/3358504
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Bytecode
  2. JavaScript
  3. Sequence alignment
  4. Similarity measurement
  5. V8

Qualifiers

  • Research-article

Conference

SPLASH '19

Acceptance Rates

Overall Acceptance Rate 4 of 4 submissions, 100%


Article Metrics

  • Downloads (Last 12 months)18
  • Downloads (Last 6 weeks)1
Reflects downloads up to 13 Nov 2024
