DOI: 10.5555/2073796.2073828
Article
Free access

SPUDD: stochastic planning using decision diagrams

Published: 30 July 1999

Abstract

Structured methods for solving factored Markov decision processes (MDPs) with large state spaces have recently been proposed to allow dynamic programming to be applied without the need for complete state enumeration. We propose and examine a new value iteration algorithm for MDPs that uses algebraic decision diagrams (ADDs) to represent value functions and policies, assuming an ADD input representation of the MDP. Dynamic programming is implemented via ADD manipulation. We demonstrate our method on a class of large MDPs (up to 63 million states) and show that significant gains can be had when compared to tree-structured representations (with up to a thirty-fold reduction in the number of nodes required to represent optimal value functions).
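To make the setting concrete, the sketch below shows the kind of Bellman backup that value iteration performs on a factored MDP whose dynamics are specified variable by variable, which is the style of input representation the abstract assumes. This is a minimal, hypothetical illustration only: the tiny two-variable MDP, the action and function names, and the use of plain Python dictionaries are all invented for the example, and the states are enumerated explicitly. SPUDD's contribution is precisely to carry out these same backups symbolically on ADDs (for instance with an ADD package such as CUDD) instead of enumerating states, so the code shows what is computed, not how the paper computes it.

from itertools import product

# A tiny factored MDP over two boolean state variables (hypothetical example).
VARS = ("x", "y")
ACTIONS = ("fix_x", "noop")
STATES = list(product([False, True], repeat=len(VARS)))
REWARD = {s: 1.0 if all(s) else 0.0 for s in STATES}   # reward when x and y both hold
GAMMA = 0.9

def p_next_true(action, var, state):
    """P(var' = True | state, action): one entry of a per-variable CPT."""
    x, y = state
    persist = {"x": 0.9 if x else 0.1, "y": 0.9 if y else 0.1}
    if action == "fix_x" and var == "x":
        return 0.9          # fix_x makes x likely to become true
    return persist[var]     # otherwise each variable tends to persist

def transition_prob(action, s, s2):
    """P(s2 | s, action) as a product of the per-variable CPT entries."""
    p = 1.0
    for var, v2 in zip(VARS, s2):
        pt = p_next_true(action, var, s)
        p *= pt if v2 else (1.0 - pt)
    return p

def value_iteration(eps=1e-4):
    """Plain enumerative value iteration; SPUDD performs these backups on ADDs."""
    V = {s: 0.0 for s in STATES}
    while True:
        Q = {(s, a): REWARD[s] + GAMMA * sum(transition_prob(a, s, s2) * V[s2]
                                             for s2 in STATES)
             for s in STATES for a in ACTIONS}
        V_new = {s: max(Q[(s, a)] for a in ACTIONS) for s in STATES}
        if max(abs(V_new[s] - V[s]) for s in STATES) < eps:
            policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES}
            return V_new, policy
        V = V_new

if __name__ == "__main__":
    V, pi = value_iteration()
    for s in STATES:
        print(dict(zip(VARS, s)), round(V[s], 3), pi[s])

In the paper's setting, the value function V and each conditional probability table would be an ADD over the state variables, and the sum and max in the backup would be implemented by ADD operations, so states that share structure never need to be touched individually.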


Published In

UAI'99: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence
July 1999
703 pages
ISBN: 1558606149
  • Editors: Kathryn B. Laskey, Henri Prade

Sponsors

  • Rockwell Science Center
  • Hugin Expert A/S
  • Information Extraction and Transportation
  • Microsoft Research
  • AT&T Labs Research

Publisher

Morgan Kaufmann Publishers Inc.

San Francisco, CA, United States

Publication History

Published: 30 July 1999

Qualifiers

  • Article


Article Metrics

  • Downloads (Last 12 months)45
  • Downloads (Last 6 weeks)7
Reflects downloads up to 14 Nov 2024


Cited By

  • (2021) A Sufficient Statistic for Influence in Structured Multiagent Environments. Journal of Artificial Intelligence Research, 70:789-870. DOI: 10.1613/jair.1.12136. Online publication date: 1-May-2021.
  • (2020) Runtime Adaptation in Wireless Sensor Nodes Using Structured Learning. ACM Transactions on Cyber-Physical Systems, 4(4):1-28. DOI: 10.1145/3372153. Online publication date: 6-Jul-2020.
  • (2019) A Memory-Efficient Markov Decision Process Computation Framework Using BDD-based Sampling Representation. Proceedings of the 56th Annual Design Automation Conference 2019, 1-6. DOI: 10.1145/3316781.3317748. Online publication date: 2-Jun-2019.
  • (2018) High-dimensional stochastic optimal control using continuous tensor decompositions. International Journal of Robotics Research, 37(2-3):340-377. DOI: 10.1177/0278364917753994. Online publication date: 1-Feb-2018.
  • (2018) Human-Agent Interaction Model Learning based on Crowdsourcing. Proceedings of the 6th International Conference on Human-Agent Interaction, 20-28. DOI: 10.1145/3284432.3284471. Online publication date: 4-Dec-2018.
  • (2018) Two Approximate Dynamic Programming Algorithms for Managing Complete SIS Networks. Proceedings of the 1st ACM SIGCAS Conference on Computing and Sustainable Societies, 1-10. DOI: 10.1145/3209811.3209814. Online publication date: 20-Jun-2018.
  • (2017) The symbolic interior point method. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 1199-1205. DOI: 10.5555/3298239.3298415. Online publication date: 4-Feb-2017.
  • (2017) Logistic Markov decision processes. Proceedings of the 26th International Joint Conference on Artificial Intelligence, 2486-2493. DOI: 10.5555/3172077.3172234. Online publication date: 19-Aug-2017.
  • (2017) Exact quantitative probabilistic model checking through rational search. Proceedings of the 17th Conference on Formal Methods in Computer-Aided Design, 92-99. DOI: 10.5555/3168451.3168475. Online publication date: 2-Oct-2017.
  • (2017) The MADP toolbox. The Journal of Machine Learning Research, 18(1):3112-3116. DOI: 10.5555/3122009.3176833. Online publication date: 1-Jan-2017.
