Spatiotemporal Abstraction of Stochastic Sequential Processes

  • Conference paper
  • In: Abstraction, Reformulation, and Approximation (SARA 2002)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2371)

Abstract

Probabilistic finite state machines have become a popular modeling tool for representing sequential processes, ranging from images and speech signals to text documents and spatial and genomic maps. In this paper, I describe two hierarchical abstraction mechanisms for simplifying the learning (estimation) and optimization (control) of complex Markov processes: spatial decomposition and temporal aggregation. I present several approaches to combining spatial and temporal abstraction, drawing upon recent work of my group as well as that of others. I show how spatiotemporal abstraction enables improved solutions to three difficult sequential estimation and decision problems: hidden state modeling and control, learning parallel plans, and coordinating with multiple agents.
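As context for the temporal side of this idea, here is a minimal, self-contained sketch (not taken from the paper) of temporal aggregation in the options/SMDP sense: an option bundles an internal policy with a termination condition, so the higher level of a hierarchy sees one multi-step, semi-Markov transition rather than many primitive ones. The corridor environment, the function names, and all parameters below are illustrative assumptions.

    # Minimal sketch (not from the paper): temporal aggregation via an
    # "option" -- a temporally extended action that runs its own policy
    # until a termination condition fires. The corridor environment and
    # all names/parameters are illustrative.

    import random

    N = 10  # corridor length: states are integer positions 0 .. N-1

    def step(state, action):
        """Primitive transition: action is -1 (left) or +1 (right)."""
        return max(0, min(N - 1, state + action))

    def run_option(state, policy, beta):
        """Execute an option starting from `state`.

        policy: state -> primitive action (-1 or +1)
        beta:   state -> probability of terminating in that state
        Returns the termination state and the number of primitive steps
        consumed (the quantity an SMDP-level model keeps track of).
        """
        steps = 0
        while True:
            state = step(state, policy(state))
            steps += 1
            if random.random() < beta(state):
                return state, steps

    # Option "go right to the end of the corridor": deterministic policy,
    # terminating with probability 1 only in the last cell.
    go_right = lambda s: +1
    stop_at_end = lambda s: 1.0 if s == N - 1 else 0.0

    final_state, duration = run_option(0, go_right, stop_at_end)
    print(f"option terminated in state {final_state} after {duration} steps")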

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mahadevan, S. (2002). Spatiotemporal Abstraction of Stochastic Sequential Processes. In: Koenig, S., Holte, R.C. (eds) Abstraction, Reformulation, and Approximation. SARA 2002. Lecture Notes in Computer Science, vol 2371. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45622-8_3

  • DOI: https://doi.org/10.1007/3-540-45622-8_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43941-7

  • Online ISBN: 978-3-540-45622-3
