Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2541940.2541989acmconferencesArticle/Chapter ViewAbstractPublication PagesasplosConference Proceedingsconference-collections
research-article

Challenging the "embarrassingly sequential": parallelizing finite state machine-based computations through principled speculation

Published: 24 February 2014 Publication History

Abstract

Finite-State Machine (FSM) applications are important for many domains. But FSM computation is inherently sequential, making such applications notoriously difficult to parallelize. Most prior methods address the problem through speculations on simple heuristics, offering limited applicability and inconsistent speedups.
This paper provides some principled understanding of FSM parallelization, and offers the first disciplined way to exploit application-specific information to inform speculations for parallelization. Through a series of rigorous analysis, it presents a probabilistic model that captures the relations between speculative executions and the properties of the target FSM and its inputs. With the formulation, it proposes two model-based speculation schemes that automatically customize themselves with the suitable configurations to maximize the parallelization benefits. This rigorous treatment yields near-linear speedup on applications that state-of-the-art techniques can barely accelerate.

References

[1]
R. Bodik.Browsing web 3.0 on 3.0 watts: Why browsers will be parallel and implications for education. Invited talk at The 3rd Workshop on Software Tools for MultiCore Systems, April, 2008.
[2]
Regular expression.texttt http://www.regular-expressions.info.
[3]
M. Abadi, A. Birrell, T. Harris, and M. Isard. Semantics of transactional memory and automatic mutual exclusion. In Proceedings of ACM Symposium on Principles of Programming Languages, 2008.
[4]
A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools. Addison Wesley, 2nd edition, August 2006.
[5]
B. Alexeev. Minimal DFA for testing divisibility. Journal of Computer and System Sciences, 69(2), 2004.
[6]
K. Asanovic, R. Bodik, B. Catanzaro, J. Gebis, P. Husbands, K. Keutzer, D. Patterson, W. Plishker, J. Shalf, S. Williams, and K. Yelick. The landscape of parallel computing research: A view from berkeley. Technical Report UCB/EECS-2006-18, University of California at Berkele, 2006.
[7]
François Baccelli and Thierry Fleury. On parsing arithmetic expressions in a multiprocessing environment. Acta Inf., 17:287--310, 1982.
[8]
François Baccelli and Philippe Mussi. An asynchronous parallel interpreter for arithmetic expressions and its evaluation. IEEE Trans. Computers, 35(3):245--256, 1986.
[9]
H.C. Bunt and A. Nijholt. Advances in probabilistic and other parsing technologies. Kluwer Academic Publishers, 2000. Chapter 12.
[10]
B. Carlstrom, A. McDonald, H. Chafi, J. Chung, C. Minh, C. Kozyrakis, and K. Olukotun. The atomos transactional programming langauges. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, 2006.
[11]
P. Charles, C. Grothoff, V. Saraswat, C. Donawa, A. Kielstra, K. Ebcioglu, C. von Praun, and V. Sarkar. X10: an object-oriented approach to non-uniform cluster computing. In OOPSLA, 2005.
[12]
C. Ding, X. Shen, K. Kelsey, C. Tice, R. Huang, and C. Zhang. Software behavior-oriented parallelization. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, 2007.
[13]
M. Feng, R. Gupta, and Y. Hu. Spicec: Scalable parallelism via implicit copying and explicit commit. In Proceedings of the ACM SIGPLAN Symposium on Principles Practice of Parallel Programming, 2011.
[14]
Charles N. Fischer. On Parsing Context Free Languages in Parallel Environments. PhD thesis, Cornell University, 1975.
[15]
M. Frigo, C. E. Leiserson, and K. H. Randall. The implementation of the Cilk-5 multithreaded language. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, 1998.
[16]
M. Herlihy and J. E. Moss. Transactional memory: Architectural support for lock-free data structures. In Proceedings of the International Symposium on High Performance Computer Architecture (HPCA), 1993.
[17]
K. Ingham, A. Somayaji, J. Burge, and S. Forrest. Learning DFA representations of HTTP for protecting web applications. Computer Networks, 51(5):1239--1255, April 2007.
[18]
C. Jones, R. Liu, L. Meyerovich, K. Asanovic, and R. Bodik. Parallelizing the web browser. In HotPar, 2009.
[19]
B. Kaplan. Speculative parsing patch. In bugzilla.mozilla.org, 2009.
[20]
S. Klein and Y. Wiseman. Parallel Huffman decoding with applications to JPEG files. Jounal of Computing, 46(5), 2003.
[21]
M. Kulkarni, K. Pingali, B. Walter, G. Ramanarayanan, K. Bala, and L.P. Chew. Optimistic parallelism requires abstractions. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, 2007.
[22]
Richard E. Ladner and Michael J. Fischer. Parallel prefix computation. J. ACM, 27(4):831--838, October 1980.
[23]
D.R. Llanos, D. Orden, and B. Palop. New scheduling strategies for randomized incremental algorithms in the context of speculative parallelization. IEEE Transactions on Computers, 2007.
[24]
Wei Lu, Kenneth Chiu, and Yinfei Pan. A parallel approach to XML parsing. In Proceedings of the 7th IEEE/ACM International Conference on Grid Computing, GRID '06, pages 223--230, 2006.
[25]
D. Luchaup, R. Smith, C. Estan, and S. Jha. Multi-byte regular expression matching with speculation. In RAID, 2009.
[26]
P. Prabhu, G. Ramalingam, and K. Vaswani. Safe programmable speculative parallelism. In Proceedings of ACM SIGPLAN Conference on Programming Languages Design and Implementation, 2010.
[27]
C.G. Quinones, C. Madriles, J. Sanchez, P. Marcuello, A. Gonzalez, and D. M. Tullsen. Mitosis compiler: an infrastructure for speculative threading based on pre-computation slices. In Proceedings of ACM SIGPLAN Conference on Programming Languages Design and Implementation, 2005.
[28]
A. Raman, H. Kim, T. R. Mason, T. B. Jablin, and D. I. August. Speculative parallelization using software multi-threaded transactions. In Proceedings of the international conference on Architectural support for programming languages and operating systems, 2010.
[29]
J. G. Steffan, C. Colohan, A. Zhai, and T. C. Mowry. The STAMPede approach to thread-level speculation. ACM Transactions on Computer Systems, 23(3):253--300, 2005.
[30]
Steven K. Thompson. Sample size for estimating multinomial proportions. The American Statistician, 1987.
[31]
C. Tian, M. Feng, V. Nagarajan, and R. Gupta. Copy or discard execution model for speculative parallelization on multicores. In Proceedings of the International Symposium on Microarchitecture, 2008.
[32]
E. Witte, R. Chamberlain, and M. Franklin. Parallel simulated annealing using speculative computation. IEEE Transactions on Parallel and Distributed Systems, 2(4):483--494, 1991.
[33]
Zhiqiang Yu, Yu Wu, Qi Zhang and Jianhui Li. A hybrid parallel processing for XML parsing and schema validation. In Balisage: The Markup Conference 2008.
[34]
Y. Zhang, Y. Pan, and K. Chiu. Speculative p-DFAs for parallel XML parsing. In Proceedings of the International Conference on High Performance Computing, 2009.
[35]
Z. Zhao, B. Wu, and X. Shen. Probabilistic analysis for optimal speculation of finite-state machine applications. Technical Report WM-CS-2014-01, College of William and Mary, 2014.
[36]
Y. Zu, M. Yang, Z. Xu, L. Wang, X. Tian, K. Peng, and Q. Dong. GPU-based NFA implementation for memory efficient high speed regular expression matching. In PPoPP '12: Proceedings of the ACM SIGPLAN symposium on Principles and practice of parallel programming, pages 129--140, 2009.

Cited By

View all
  • (2024)ngAP: Non-blocking Large-scale Automata Processing on GPUsProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 110.1145/3617232.3624848(268-285)Online publication date: 27-Apr-2024
  • (2023)Asynchronous Automata Processing on GPUsProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/35794537:1(1-27)Online publication date: 2-Mar-2023
  • (2023)Parallel Pattern Matching over Brotli Compressed Network Traffic2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)10.1109/TrustCom60117.2023.00079(477-484)Online publication date: 1-Nov-2023
  • Show More Cited By

Index Terms

  1. Challenging the "embarrassingly sequential": parallelizing finite state machine-based computations through principled speculation

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    ASPLOS '14: Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
    February 2014
    780 pages
    ISBN:9781450323055
    DOI:10.1145/2541940
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    In-Cooperation

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 24 February 2014

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. dfa
    2. fsm
    3. lookback
    4. multicore
    5. partial commit
    6. speculative parallelization

    Qualifiers

    • Research-article

    Conference

    ASPLOS '14

    Acceptance Rates

    ASPLOS '14 Paper Acceptance Rate 49 of 217 submissions, 23%;
    Overall Acceptance Rate 535 of 2,713 submissions, 20%

    Upcoming Conference

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)56
    • Downloads (Last 6 weeks)11
    Reflects downloads up to 10 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)ngAP: Non-blocking Large-scale Automata Processing on GPUsProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 110.1145/3617232.3624848(268-285)Online publication date: 27-Apr-2024
    • (2023)Asynchronous Automata Processing on GPUsProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/35794537:1(1-27)Online publication date: 2-Mar-2023
    • (2023)Parallel Pattern Matching over Brotli Compressed Network Traffic2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)10.1109/TrustCom60117.2023.00079(477-484)Online publication date: 1-Nov-2023
    • (2023)Harry: A Scalable SIMD-based Multi-literal Pattern Matching Engine for Deep Packet InspectionIEEE INFOCOM 2023 - IEEE Conference on Computer Communications10.1109/INFOCOM53939.2023.10229022(1-10)Online publication date: 17-May-2023
    • (2022)RACODProceedings of the 49th Annual International Symposium on Computer Architecture10.1145/3470496.3527383(597-609)Online publication date: 18-Jun-2022
    • (2021)In-/Near-Memory ComputingSynthesis Lectures on Computer Architecture10.2200/S01109ED1V01Y202106CAC05716:2(1-140)Online publication date: 11-Aug-2021
    • (2021)Scalable FSM parallelization via path fusion and higher-order speculationProceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems10.1145/3445814.3446705(887-901)Online publication date: 19-Apr-2021
    • (2021)Plex: Scaling Parallel Lexing with Backtrack-Free Prescanning2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS)10.1109/IPDPS49936.2021.00079(693-702)Online publication date: May-2021
    • (2020)Reliability Analysis for Unreliable FSM ComputationsACM Transactions on Architecture and Code Optimization10.1145/337745617:2(1-23)Online publication date: 29-May-2020
    • (2020)Why GPUs are Slow at Executing NFAs and How to Make them FasterProceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems10.1145/3373376.3378471(251-265)Online publication date: 9-Mar-2020
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media