Abstract
Probabilistic finite state machines have become a popular tool for modeling sequential processes, ranging from images and speech signals to text documents and spatial and genomic maps. In this paper, I describe two hierarchical abstraction mechanisms for simplifying the learning (estimation) and control (optimization) of complex Markov processes: spatial decomposition and temporal aggregation. I present several approaches to combining spatial and temporal abstraction, drawing on recent work by my group as well as that of others. I show how spatiotemporal abstraction enables improved solutions to three difficult sequential estimation and decision problems: hidden state modeling and control, learning parallel plans, and multiagent coordination.
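As a concrete illustration of what temporal aggregation means here, the sketch below implements a minimal version of the options/SMDP framework of Sutton, Precup, and Singh, one family of temporally extended actions in this line of work. The corridor environment, the Option fields, and all function and parameter names are illustrative assumptions of mine, not code from the paper.

import random
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass
class Option:
    # Illustrative rendering of an option as (I, pi, beta).
    initiation: Callable[[int], bool]    # I: states where the option may be invoked
    policy: Callable[[int], int]         # pi: primitive action chosen in each state
    termination: Callable[[int], float]  # beta: probability of terminating in a state

N = 6  # hypothetical toy corridor of states 0..N-1, reward only at the right end

def env_step(state, action):
    # Toy dynamics: action is +1 (right) or -1 (left), clipped to the corridor.
    s2 = min(max(state + action, 0), N - 1)
    return s2, (1.0 if s2 == N - 1 else 0.0)

def run_option(state, option, gamma=0.9):
    # Execute the option until its termination condition fires, accumulating
    # the discounted reward and the compound discount needed by the SMDP update.
    total, discount = 0.0, 1.0
    while True:
        state, reward = env_step(state, option.policy(state))
        total += discount * reward
        discount *= gamma
        if random.random() < option.termination(state):
            return state, total, discount

go_right = Option(lambda s: s < N - 1, lambda s: +1, lambda s: float(s == N - 1))
go_left  = Option(lambda s: s > 0,     lambda s: -1, lambda s: float(s == 0))
options = [go_right, go_left]

Q: Dict[Tuple[int, int], float] = {(s, o): 0.0 for s in range(N) for o in range(len(options))}

for episode in range(200):
    s = random.randrange(N - 1)  # start anywhere except the goal
    while s != N - 1:
        avail = [o for o in range(len(options)) if options[o].initiation(s)]
        o = random.choice(avail) if random.random() < 0.1 else max(avail, key=lambda o: Q[(s, o)])
        s2, r, disc = run_option(s, options[o])
        # SMDP Q-learning: disc = gamma**k already reflects the option's duration k,
        # so a multi-step option is updated with a single backup.
        best = max(Q[(s2, o2)] for o2 in range(len(options)) if options[o2].initiation(s2))
        Q[(s, o)] += 0.1 * (r + disc * best - Q[(s, o)])
        s = s2

print("Q(s, go_right) along the corridor:", [round(Q[(s, 0)], 2) for s in range(N)])

The point of the sketch is the single-backup update: because each option runs to termination, the agent plans over a handful of macro-decisions rather than every primitive step, which is the efficiency argument temporal aggregation rests on.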
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Mahadevan, S. (2002). Spatiotemporal Abstraction of Stochastic Sequential Processes. In: Koenig, S., Holte, R.C. (eds) Abstraction, Reformulation, and Approximation. SARA 2002. Lecture Notes in Computer Science, vol 2371. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45622-8_3