research-article

Using Declarative Specification to Improve the Understanding, Extensibility, and Comparison of Model-Inference Algorithms

Authors:

Ivan Beschastnikh,

Jenny Abrahamson,

Michael D. Ernst,

Arvind KrishnamurthyAuthors Info & Claims

IEEE Transactions on Software Engineering, Volume 41, Issue 4

Pages 408 - 428

https://doi.org/10.1109/TSE.2014.2369047

Published: 01 April 2015 Publication History

Abstract

It is a staple development practice to log system behavior. Numerous powerful model-inference algorithms have been proposed to aid developers in log analysis and system understanding. Unfortunately, existing algorithms are typically declared procedurally, making them difficult to understand, extend, and compare. This paper presents InvariMint, an approach to specify model-inference algorithms declaratively. We applied the InvariMint declarative approach to two model-inference algorithms. The evaluation results illustrate that InvariMint (1) leads to new fundamental insights and better understanding of existing algorithms, (2) simplifies creation of new algorithms, including hybrids that combine or extend existing algorithms, and (3) makes it easy to compare and contrast previously published algorithms. InvariMint's declarative approach can outperform procedural implementations. For example, on a log of 50,000 events, InvariMint's declarative implementation of the kTails algorithm completes in 12 seconds, while a procedural implementation completes in 18 minutes. We also found that InvariMint's declarative version of the Synoptic algorithm can be over 170 times faster than the procedural implementation.

References

[1]

M. Acharya, T. Xie, J. Pei, and J. Xu, “Mining API patterns as partial orders from source code: From usage scenarios to specifications,” in Proc. Joint Meeting Eur. Softw. Eng. Conf. ACM SIGSOFT Symp. Foundations Softw. Eng., 2007, pp. 25– 34.

[2]

D. Angluin, “Finding patterns common to a set of strings, ” J. Comput. Syst. Sci., vol. 21, no. 1, pp. 46 –62, 1980.

[3]

C. M. Antunes and A. L. Oliveira, “Temporal data mining: An overview,” in Proc. Workshop Temporal Data Mining, 2001.

[4]

A. Bertolino, P. Inverardi, P. Pelliccione, and M. Tivoli, “Automatic synthesis of behavior protocols for composable web-services,” in Proc. Joint Meeting Eur. Softw. Eng. Conf. ACM SIGSOFT Symp. Foundations Softw. Eng., 2009, pp. 141–150.

[5]

I. Beschastnikh, Y. Brun, J. Abrahamson, M. D. Ernst, and A. Krishnamurthy, “Unifying FSM-inference algorithms through declarative specification,” in Proc. IEEE/ACM Int. Conf. Softw. Eng., 2013, pp. 252–261.

[6]

I. Beschastnikh, Y. Brun, M. D. Ernst, and A. Krishnamurthy, “Inferring models of concurrent systems from logs of their behavior with CSight,” in Proc. IEEE/ACM Int. Conf. Softw. Eng., 2014, pp. 468–479.

[7]

I. Beschastnikh, Y. Brun, S. Schneider, M. Sloan, and M. D. Ernst, “Leveraging existing instrumentation to automatically infer invariant-constrained models,” in Proc. Joint Meeting Eur. Softw. Eng. Conf. ACM SIGSOFT Symp. Foundations Softw. Eng., 2011, pp. 267–277.

[8]

A. W. Biermann and J. A. Feldman, “On the synthesis of finite-state machines from samples of their behavior,” IEEE Trans. Comput., vol. 21, no. 6, pp. 592 –597, Jun. 1972.

Digital Library

[9]

B. Bollig, J.-P. Katoen, C. Kern, and M. Leucker, “Learning communicating automata from MSCs,” IEEE Trans. Softw. Eng., vol. 36, no. 3, pp. 390–408, 2010.

Digital Library

[10]

R. C. Carrasco and J. Oncina, “Learning stochastic regular grammars by means of a state merging method,” in Proc. Int. Colloquium Grammatical Inference Appl., 1994, pp. 139 –152.

Digital Library

[11]

E. Clarke, O. Grumberg, S. Jha, Y. Lu, and H. Veith, “Counterexample-guided abstraction refinement,” in Proc. Comput. Aided Verification, 2000, pp. 154–169.

[12]

J. E. Cook and A. L. Wolf, “Discovering models of software processes from event-based data,” ACM Trans. Softw. Eng. Methodology, vol. 7, no. 3, pp. 215 –249, 1998.

Digital Library

[13]

M. D. Ernst, J. Cockrell, W. G. Griswold, and D. Notkin, “Dynamically discovering likely program invariants to support program evolution,” IEEE Trans. Softw. Eng., vol. 27, no. 2, pp. 99–123, Feb. 2001.

Digital Library

[14]

M. D. Ernst, R. Lencevicius, and J. F. H. Perkins, “ Detection of web service substitutability and composability,” in Proc. Int. Workshop Web Serv.—Modeling Testing, 2006, pp. 123–135.

[15]

M. Gabel and Z. Su, “Javert: Fully automatic mining of general temporal properties from dynamic traces,” in Proc. ACM SIGSOFT Int. Symp. Found. Softw. Eng., 2008, pp. 339 –349.

[16]

M. Gabel and Z. Su, “Testing mined specifications,” in Proc. ACM SIGSOFT Int. Symp. Found. Softw. Eng., 2012, pp. 4:1–4:11.

[17]

E. M. Gold, “Language identification in the limit,” Inform. Control, vol. 10, no. 5, pp. 447– 474, 1967.

[18]

S. Hangal and M. S. Lam, “Tracking down software bugs using automatic anomaly detection,” in Proc. IEEE/ACM Int. Conf. Softw. Eng., 2002, pp. 291–301.

[19]

J. Hatcliff and M. B. Dwyer, “Using the Bandera tool set to model-check properties of concurrent Java software,” in Proc. Int. Conf. Concurrency Theory, 2001, pp. 39– 58.

[20]

J. E. Hopcroft, “An $n\; \log\; n$ algorithm for minimizing states in a finite automaton,” Stanford Univ., Stanford, CA, USA, Tech. Rep. STAN-CS-71-190, 1971.

[21]

I. Krka, Y. Brun, and N. Medvidovic, “ Automatic mining of specifications from invocation traces and method invariants,” in Proc. ACM SIGSOFT Symp. Found. Softw. Eng., 2014, pp. 178–189 .

[22]

S. Kumar, S.-C. Khoo, A. Roychoudhury, and D. Lo, “Inferring class level specifications for distributed systems,” in Proc. IEEE/ACM Int. Conf. Softw. Eng. , 2012, pp. 914–924.

[23]

W. Li, “Specification mining: New formalisms, algorithms and applications,” Ph.D. thesis, EECS Dept., Univ. of California, Berkeley, Berkeley, CA, USA, Mar. 2014.

[24]

W. Li, L. Dworkin, and S. A. Seshia, “Mining assumptions for synthesis,” in Proc. ACM/IEEE Int. Conf. Formal Methods Models for Codes., 2011, pp. 43–50.

Digital Library

[25]

D. Lo and S.-C. Khoo, “ QUARK: Empirical assessment of automaton-based specification miners,” in Proc. Working Conf. Reverse Eng., 2006, pp. 51–60.

[26]

D. Lo and S.-C. Khoo, “ SMArTIC: Towards building an accurate, robust and scalable specification miner,” in Proc. ACM SIGSOFT Int. Symp. Found. Softw. Eng., 2006, pp. 265– 275.

[27]

D. Lo and S. Maoz, “ Scenario-based and value-based specification mining: Better together,” in Proc. IEEE/ACM Int. Conf. Automated Softw. Eng., 2010, pp. 387–396.

Digital Library

[28]

D. Lo, L. Mariani and M. Pezzè, “Automatic steering of behavioral model inference,” in Proc. Joint Meeting Eur. Softw. Eng. Conf. ACM SIGSOFT Symp. Found. Softw. Eng., 2009, pp. 345–354.

[29]

D. Lorenzoli, L. Mariani, and M. Pezzè, “ Automatic generation of software behavioral models,” in Proc. ACM/IEEE Int. Conf. Softw. Eng., 2008, pp. 501–510.

[30]

L. Mariani, A. Marchetto, C. D. Nguyen, P. Tonella, and A. Baars, “Revolution: Automatic evolution of mined specifications, ” in Proc. IEEE Int. Symp. Softw. Rel. Eng., 2012, pp. 241 –250.

[31]

L. Mariani and M. Pezzè, “Dynamic detection of COTS component incompatibility,” IEEE Softw., vol. 24, no. 5, pp. 76–85, Sep./Oct. 2007.

Digital Library

[32]

A. Møller. (2010). dk.brics.automaton—Finite-state automata and regular expressions for Java. [Online]. Available: http://www.brics.dk/automaton/

[33]

C. D. Nguyen, A. Marchetto, and P. Tonella, “Automated oracles: An empirical study on cost and effectiveness,” in Proc. Joint Meeting Eur. Softw. Eng. Conf. ACM SIGSOFT Symp. Found. Softw. Eng., 2013, pp. 136– 146.

[34]

T. Ohmann, M. Herzberg, S. Fiss, A. Halbert, M. Palyart, I. Beschastnikh, and Y. Brun, “ Behavioral resource-aware model inference,” in Proc. IEEE/ACM Int. Conf. Automated Softw. Eng., 2014, pp. 19–30.

[35]

A. Pnueli, “The temporal logic of programs,” in Proc. Symp. Found. Comput. Sci., 1977, pp. 46–57.

[36]

M. Pradel, P. Bichsel, and T. R. Gross, “A framework for the evaluation of specification miners based on finite state machines,” in Proc. IEEE Int. Conf. Softw. Maintenance, 2010, pp. 1–10.

[37]

A. V. Raman and J. D. Patrick, “The sk-strings method for inferring PFSA,” in Proc. Workshop Automata Induction, Grammatical Inference, Lang. Acquisition, 1997, http://www.cs.iastate.edu/~honavar/mlworkshop.html#papers

[38]

S. P. Reiss and M. Renieris, “Encoding program executions,” in Proc. ACM/IEEE Int. Conf. Softw. Eng., 2001, pp. 221–230.

[39]

N. Walkinshaw and K. Bogdanov, “Inferring finite-state models with temporal constraints,” in Proc. IEEE/ACM Int. Conf. Autom. Softw. Eng., 2008, pp. 248–257.

[40]

W. Xu, L. Huang, A. Fox, D. Patterson, and M. Jordan, “Experience mining Google’s production console logs,” in Proc. Workshop Manag. Syst. Log Anal. Mach. Learn. Techn., 2010, pp. 5:1 –5:9.

[41]

J. Yang, D. Evans, D. Bhardwaj, T. Bhat, and M. Das, “Perracotta: Mining temporal API rules from imperfect traces, ” in Proc. ACM/IEEE Int. Conf. Softw. Eng., 2006, pp. 282 –291.

[42]

T. Ziadi, M. A. A. da Silva, L. M. Hillah, and M. Ziane, “A fully dynamic approach to the reverse engineering of UML sequence diagrams,” in Proc. IEEE Int. Conf. Eng. Complex Comput. Syst., 2011, pp. 107–116.

Cited By

Mora FDesai APolgreen ESeshia S(2023)Message Chains for Distributed System VerificationProceedings of the ACM on Programming Languages10.1145/36228767:OOPSLA2(2224-2250)Online publication date: 16-Oct-2023
https://dl.acm.org/doi/10.1145/3622876
Brun YLin TSomerville JMyers EEbner N(2023)Blindspots in Python and Java APIs Result in Vulnerable CodeACM Transactions on Software Engineering and Methodology10.1145/357185032:3(1-31)Online publication date: 26-Apr-2023
https://dl.acm.org/doi/10.1145/3571850
Nguyen HPhan HKhairunnesa SNguyen SYadavally AWang SRajan HNguyen T(2022)A Hybrid Approach for Inference between Behavioral Exception API Documentation and Implementations, and Its ApplicationsProceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering10.1145/3551349.3560434(1-13)Online publication date: 10-Oct-2022
https://dl.acm.org/doi/10.1145/3551349.3560434
Show More Cited By

Index Terms

Using Declarative Specification to Improve the Understanding, Extensibility, and Comparison of Model-Inference Algorithms

Index terms have been assigned to the content through auto-classification.

Recommendations

Layout-sensitive language extensibility with SugarHaskell
Haskell '12: Proceedings of the 2012 Haskell Symposium

Programmers need convenient syntax to write elegant and concise programs. Consequently, the Haskell standard provides syntactic sugar for some scenarios (e.g., do notation for monadic code), authors of Haskell compilers provide syntactic sugar for more ...
Layout-sensitive language extensibility with SugarHaskell
Haskell '12

Programmers need convenient syntax to write elegant and concise programs. Consequently, the Haskell standard provides syntactic sugar for some scenarios (e.g., do notation for monadic code), authors of Haskell compilers provide syntactic sugar for more ...
Unifying FSM-inference algorithms through declarative specification
ICSE '13: Proceedings of the 2013 International Conference on Software Engineering

Logging system behavior is a staple development practice. Numerous powerful model inference algorithms have been proposed to aid developers in log analysis and system understanding. Unfortunately, existing algorithms are difficult to understand, extend,...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image IEEE Transactions on Software Engineering

IEEE Transactions on Software Engineering Volume 41, Issue 4

April 2015

98 pages

ISSN:0098-5589

Issue’s Table of Contents

Copyright © 2014.

Publisher

IEEE Press

Publication History

Published: 01 April 2015

Author Tags

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

12
Total Citations
View Citations
0
Total Downloads

Downloads (Last 12 months)0
Downloads (Last 6 weeks)0

Reflects downloads up to 17 Feb 2025

Other Metrics

View Author Metrics

Citations

Cited By

Mora FDesai APolgreen ESeshia S(2023)Message Chains for Distributed System VerificationProceedings of the ACM on Programming Languages10.1145/36228767:OOPSLA2(2224-2250)Online publication date: 16-Oct-2023
https://dl.acm.org/doi/10.1145/3622876
Brun YLin TSomerville JMyers EEbner N(2023)Blindspots in Python and Java APIs Result in Vulnerable CodeACM Transactions on Software Engineering and Methodology10.1145/357185032:3(1-31)Online publication date: 26-Apr-2023
https://dl.acm.org/doi/10.1145/3571850
Nguyen HPhan HKhairunnesa SNguyen SYadavally AWang SRajan HNguyen T(2022)A Hybrid Approach for Inference between Behavioral Exception API Documentation and Implementations, and Its ApplicationsProceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering10.1145/3551349.3560434(1-13)Online publication date: 10-Oct-2022
https://dl.acm.org/doi/10.1145/3551349.3560434
Gerten MMarsh ALathrop JCohen MMiner AKlinge TDwyer MDamian DZeller A(2022)Inference and test generation using program invariants in chemical reaction networksProceedings of the 44th International Conference on Software Engineering10.1145/3510003.3510176(1193-1205)Online publication date: 21-May-2022
https://dl.acm.org/doi/10.1145/3510003.3510176
Shin DBianculli DBriand L(2022)PRINS: scalable model inference for component-based system logsEmpirical Software Engineering10.1007/s10664-021-10111-427:4Online publication date: 1-Jul-2022
https://dl.acm.org/doi/10.1007/s10664-021-10111-4
Kang HLo D(2021)Adversarial Specification MiningACM Transactions on Software Engineering and Methodology10.1145/342430730:2(1-40)Online publication date: 3-Jan-2021
https://dl.acm.org/doi/10.1145/3424307
Johnson BBrun YMeliou ARothermel GBae D(2020)Causal testingProceedings of the ACM/IEEE 42nd International Conference on Software Engineering10.1145/3377811.3380377(87-99)Online publication date: 27-Jun-2020
https://dl.acm.org/doi/10.1145/3377811.3380377
Fu ZGuo CZhang ZRen SSha L(2020)UACFinderACM Transactions on Cyber-Physical Systems10.1145/33754054:3(1-25)Online publication date: 12-Mar-2020
https://dl.acm.org/doi/10.1145/3375405
Alimadadi SMesbah APattabiraman KChaudron MCrnkovic IChechik MHarman M(2018)Inferring hierarchical motifs from execution tracesProceedings of the 40th International Conference on Software Engineering10.1145/3180155.3180216(776-787)Online publication date: 27-May-2018
https://dl.acm.org/doi/10.1145/3180155.3180216
Schnepf NBadonnel RLahmadi AMerz S(2018)Generation of SDN policies for protecting android environments based on automata learningNOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium10.1109/NOMS.2018.8406153(1-7)Online publication date: 23-Apr-2018
https://dl.acm.org/doi/10.1109/NOMS.2018.8406153
Show More Cited By

View Options

View options

Figures

Tables

Media

View Issue’s Table of Contents