Abstract
We propose a general definition of composition operator on Markov Decision Processes with rewards (MDPs) and identify a well behaved class of operators, called safe, that are guaranteed to be non-extensive w.r.t. the bisimilarity pseudometrics of Ferns et al. [10], which measure behavioral similarities between MDPs. For MDPs built using safe/non-extensive operators, we present the first method that exploits the structure of the system for (exactly) computing the bisimilarity distance on MDPs. Experimental results show significant improvements upon the non-compositional technique.
Work supported by the VKR Center of Excellence MT-LAB and by the Sino-Danish Basic Research Center IDEA4CPS.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Bacci, G., Bacci, G., Larsen, K.G., Mardare, R.: On-the-Fly Exact Computation of Bisimilarity Distances. In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 1–15. Springer, Heidelberg (2013)
Castro, P.S., Precup, D.: Using bisimulation for policy transfer in MDPs. In: Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2010, Richland, SC, vol. 1, pp. 1399–1400. International Foundation for Autonomous Agents and Multiagent Systems (2010)
Castro, P.S., Precup, D.: Automatic Construction of Temporally Extended Actions for MDPs Using Bisimulation Metrics. In: Sanner, S., Hutter, M. (eds.) EWRL 2011. LNCS, vol. 7188, pp. 140–152. Springer, Heidelberg (2012)
Chatterjee, K., de Alfaro, L., Majumdar, R., Raman, V.: Algorithms for Game Metrics. Logical Methods in Computer Science 6(3) (2010)
Chen, D., van Breugel, F., Worrell, J.: On the Complexity of Computing Probabilistic Bisimilarity. In: Birkedal, L. (ed.) FOSSACS 2012. LNCS, vol. 7213, pp. 437–451. Springer, Heidelberg (2012)
Comanici, G., Panangaden, P., Precup, D.: On-the-Fly Algorithms for Bisimulation Metrics. In: International Conference on Quantitative Evaluation of Systems, pp. 94–103 (2012)
Comanici, G., Precup, D.: Basis function discovery using spectral clustering and bisimulation metrics. In: AAMAS 2011, Richland, SC, vol. 3, pp. 1079–1080. International Foundation for Autonomous Agents and Multiagent Systems (2011)
Dantzig, G.B.: Application of the Simplex method to a transportation problem. In: Koopmans, T. (ed.) Activity Analysis of Production and Allocation, pp. 359–373. J. Wiley, New York (1951)
Desharnais, J., Gupta, V., Jagadeesan, R., Panangaden, P.: Metrics for labelled Markov processes. Theoretical Computer Science 318(3), 323–354 (2004)
Ferns, N., Panangaden, P., Precup, D.: Metrics for finite Markov Decision Processes. In: Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, UAI, pp. 162–169. AUAI Press (2004)
Giacalone, A., Jou, C., Smolka, S.A.: Algebraic reasoning for probabilistic concurrent systems. In: Proc. IFIP TC2 Working Conference on Programming Concepts and Methods, pp. 443–458. North-Holland (1990)
Givan, R., Dean, T., Greig, M.: Equivalence notions and model minimization in Markov decision processes. Artificial Intelligence 147(1-2), 163–223 (2003)
Larsen, K.G., Skou, A.: Bisimulation through probabilistic testing. Information and Computation 94(1), 1–28 (1991)
Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1st edn. John Wiley & Sons, Inc., New York (1994)
Thorsley, D., Klavins, E.: Approximating stochastic biochemical processes with Wasserstein pseudometrics. IET Systems Biology 4(3), 193–211 (2010)
van Breugel, F., Sharma, B., Worrell, J.: Approximating a Behavioural Pseudometric without Discount for Probabilistic Systems. Logical Methods in Computer Science 4(2), 1–23 (2008)
van Breugel, F., Worrell, J.: Approximating and computing behavioural distances in probabilistic transition systems. Theoretical Computer Science 360(1-3), 373–385 (2006)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bacci, G., Bacci, G., Larsen, K.G., Mardare, R. (2013). Computing Behavioral Distances, Compositionally. In: Chatterjee, K., Sgall, J. (eds) Mathematical Foundations of Computer Science 2013. MFCS 2013. Lecture Notes in Computer Science, vol 8087. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40313-2_9
Download citation
DOI: https://doi.org/10.1007/978-3-642-40313-2_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40312-5
Online ISBN: 978-3-642-40313-2
eBook Packages: Computer ScienceComputer Science (R0)