Abstract
This paper develops a free energy theory from physics including the variational principles for automata and languages and also provides algorithms to compute the energy as well as efficient algorithms for estimating the nondeterminism in a nondeterministic finite automaton. This theory is then used as a foundation to define a semantic similarity metric for automata and languages. Since automata are a fundamental model for all modern programs while languages are a fundamental model for the programs’ behaviors, we believe that the theory and the metric developed in this paper can be further used for real-word programs as well.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
The factor 2, intuitively, comes from the fact that we “stretch”, by a factor of 2, a run in finite automaton M to correspond it to a walk in graph \({\hat{M}}\). A somewhat more efficient way to compute \({{{\mathcal {E}}}}(M_V)\) is to construct an \(m\times m\) Gurevich matrix \(\mathbf{M}'\) where m is the number of states in M such that \(\mathbf{M}'_{ij}=0\) if there is no transition from \(p_i\) to \(p_j\) in M, else \(\mathbf{M}'_{ij}=\sum _{a:(p_i,a,p_j)\in T} e^{V(p_i,a,p_j)}\). Herein, \(p_1,\cdots , p_m\) are all states in M. One can show that \({{{\mathcal {E}}}}(M_V)=\ln \lambda '\) where \(\lambda '\) is the Perron-Frobenius eigenvalue of \(\mathbf{M}'\). We omit the details.
References
Chartrand, G., Kubicki, G., Schultz, M.: Graph similarity, distance in graphs. Aequationes Math. 55(1), 129–145 (1998)
Chomsky, N., Miller, G.A.: Finite state languages. Inf. Control 1(2), 91–112 (1958)
Cui, C., Dang, Z., Fischer, T.R.: Typical paths of a graph. Fundam. Inform. 110( 1–4), 95–109 (2011)
Cui, C., Dang, Z., Fischer, T.R., Ibarra, O.H.: Similarity in languages and programs. Theor. Comput. Sci. 498, 58–75 (2013)
Cui, C., Dang, Z., Fischer, T.R., Ibarra, O.H.: Information rate of some classes of non-regular languages: an automata-theoretic approach. In: Csuhaj-Varjú, E., Dietzfelbinger, M., Ésik, Z. (eds.) MFCS 2014. LNCS, vol. 8634, pp. 232–243. Springer, Heidelberg (2014)
Cui, C., Dang, Z., Fischer, T.R., Ibarra, O.H.: Execution information rate for some classes of automata. Inf. Comput. 246, 20–29 (2016)
Dang, Z., Dementyev, D., Fischer, T.R., Hutton III, W.J.: Security of numerical sensors in automata. In: Drewes, F. (ed.) CIAA 2015. LNCS, vol. 9223, pp. 76–88. Springer, Heidelberg (2015)
Dehmer, M., Emmert-Streib, F., Kilian, J.: A similarity measure for graphs with low computational complexity. Appl. Math. Comput. 182(1), 447–459 (2006)
Delvenne, J.-C., Libert, A.-S.: Centrality measures and thermodynamic formalism for complex networks. Phys. Rev. E 83, 046117 (2011)
ElGhawalby, H., Hancock, E.R.: Measuring graph similarity using spectral geometry. In: Campilho, A., Kamel, M.S. (eds.) ICIAR 2008. LNCS, vol. 5112, pp. 517–526. Springer, Heidelberg (2008)
Gurevich, B.M.: A variational characterization of one-dimensional countable state gibbs random fields. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 68(2), 205–242 (1984)
Harel, D.: Statecharts: a visual formalism for complex systems. Sci. Comput. Program. 8(3), 231–274 (1987)
Ibarra, O.H., Cui, C., Dang, Z., Fischer, T.R.: Lossiness of communication channels modeled by transducers. In: Beckmann, A., Csuhaj-Varjú, E., Meer, K. (eds.) CiE 2014. LNCS, vol. 8493, pp. 224–233. Springer, Heidelberg (2014)
Koslicki, D.: Topological entropy of DNA sequences. Bioinformatics 27(8), 1061–1067 (2011)
Koslicki, D., Thompson, D.J.: Coding sequence density estimation via topological pressure. J. Math. Biol. 70(1), 45–69 (2014)
Li, Q., Dang, Z.: Sampling automata and programs. Theor. Comput. Sci. 577, 125–140 (2015)
Naval, S., Laxmi, V., Rajarajan, M., Gaur, M.S., Conti, M.: Employing program semantics for malware detection. IEEE Trans. Inf. Forensics Secur. 10(12), 2591–2604 (2015)
Ruelle, D.: Thermodynamic Formalism: The Mathematical Structure of Equilibrium Statistical Mechanics. Cambridge University Press/Cambridge Mathematical Library, Cambridge (2004)
Sarig, O.M.: Thermodynamic formalism for countable Markov shifts. Ergodic Theor. Dyn. Syst. 19, 1565–1593 (1999)
Sarig, O.M.: Lecture notes on thermodynamic formalism for topological Markov shifts (2009)
Shannon, C.E., Weaver, W.: The Mathematical Theory of Communication. University of Illinois Press, Urbana (1949)
Sokolsky, O., Kannan, S., Lee, I.: Simulation-based graph similarity. In: Hermanns, H., Palsberg, J. (eds.) TACAS 2006. LNCS, vol. 3920, pp. 426–440. Springer, Heidelberg (2006)
Walters, P.: An Introduction to Ergodic Theory. Graduate Texts in Mathematics. Springer, New York (1982)
Zager, L.A., Verghese, G.C.: Graph Similarity Scoring and Matching. Appl. Math. Lett. 21(1), 86–94 (2008)
Acknowledgements
We would like to thank Jean-Charles Delvenne, David Koslicki, Daniel J. Thompson, Eric Wang, William J. Hutton III, and Ali Saberi for discussions. We would also like to thank the seven referees for suggestions and comments that have improved the presentation of our results.
Author information
Authors and Affiliations
Corresponding authors
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Cui, C., Dang, Z. (2016). A Free Energy Foundation of Semantic Similarity in Automata and Languages. In: Amsaleg, L., Houle, M., Schubert, E. (eds) Similarity Search and Applications. SISAP 2016. Lecture Notes in Computer Science(), vol 9939. Springer, Cham. https://doi.org/10.1007/978-3-319-46759-7_3
Download citation
DOI: https://doi.org/10.1007/978-3-319-46759-7_3
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46758-0
Online ISBN: 978-3-319-46759-7
eBook Packages: Computer ScienceComputer Science (R0)