DOI: 10.5555/3020548.3020588

Sum-product networks: a new deep architecture

Published: 14 July 2011

Abstract

The key limiting factor in graphical model inference and learning is the complexity of the partition function. We thus ask the question: what are general conditions under which the partition function is tractable? The answer leads to a new kind of deep architecture, which we call sum-product networks (SPNs). SPNs are directed acyclic graphs with variables as leaves, sums and products as internal nodes, and weighted edges. We show that if an SPN is complete and consistent it represents the partition function and all marginals of some graphical model, and give semantics to its nodes. Essentially all tractable graphical models can be cast as SPNs, but SPNs are also strictly more general. We then propose learning algorithms for SPNs, based on backpropagation and EM. Experiments show that inference and learning with SPNs can be both faster and more accurate than with standard deep networks. For example, SPNs perform image completion better than state-of-the-art deep networks for this task. SPNs also have intriguing potential connections to the architecture of the cortex.
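
A minimal sketch helps make the tractability claim concrete: in an SPN, every query is answered by a single bottom-up pass over the DAG, with the indicators of unobserved variables set to 1. The following Python sketch (illustrative class and variable names, not the authors' implementation) evaluates a toy complete and consistent SPN over two Boolean variables to obtain its partition function and a marginal:

```python
# Minimal illustrative SPN evaluator (assumed structure, not the paper's code).

class Leaf:
    """Indicator [X = v] for a Boolean variable X."""
    def __init__(self, var, value):
        self.var, self.value = var, value

    def eval(self, evidence):
        # A variable absent from the evidence is marginalized out by
        # setting both of its indicators to 1.
        if self.var not in evidence:
            return 1.0
        return 1.0 if evidence[self.var] == self.value else 0.0

class Product:
    def __init__(self, children):
        self.children = children

    def eval(self, evidence):
        v = 1.0
        for child in self.children:
            v *= child.eval(evidence)
        return v

class Sum:
    def __init__(self, weighted_children):  # list of (weight, child) pairs
        self.weighted_children = weighted_children

    def eval(self, evidence):
        return sum(w * child.eval(evidence) for w, child in self.weighted_children)

# Toy SPN over X1, X2: a two-component mixture of fully factorized distributions.
x1, nx1 = Leaf("X1", 1), Leaf("X1", 0)
x2, nx2 = Leaf("X2", 1), Leaf("X2", 0)
spn = Sum([
    (0.7, Product([Sum([(0.6, x1), (0.4, nx1)]), Sum([(0.9, x2), (0.1, nx2)])])),
    (0.3, Product([Sum([(0.2, x1), (0.8, nx1)]), Sum([(0.5, x2), (0.5, nx2)])])),
])

Z = spn.eval({})                # partition function: all indicators set to 1
p = spn.eval({"X1": 1}) / Z     # marginal P(X1 = 1), one more upward pass
print(Z, p)                     # 1.0 and 0.48 for the weights above
```

Each such query costs time linear in the number of edges, which is the sense in which a complete and consistent SPN makes the partition function and all marginals tractable.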



Published In

UAI'11: Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence
July 2011, 853 pages
ISBN: 9780974903972
Editors: Fabio Cozman, Avi Pfeffer

Sponsors

• Pascal Network of Excellence
• Google Inc.
• Artificial Intelligence Journal
• IBM Research
• Microsoft Research

Publisher

AUAI Press, Arlington, Virginia, United States

Cited By

• (2024) Building expressive and tractable probabilistic generative models. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, pp. 8234-8243. DOI: 10.24963/ijcai.2024/910. Online publication date: 3-Aug-2024.
• (2023) Probabilistic circuits that know what they don't know. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, pp. 2157-2167. DOI: 10.5555/3625834.3626036. Online publication date: 31-Jul-2023.
• (2023) Knowledge intensive learning of cutset networks. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, pp. 1380-1389. DOI: 10.5555/3625834.3625964. Online publication date: 31-Jul-2023.
• (2021) Fauce. Proceedings of the VLDB Endowment, 14(11), pp. 1950-1963. DOI: 10.14778/3476249.3476254. Online publication date: 27-Oct-2021.
• (2021) Are we ready for learned cardinality estimation? Proceedings of the VLDB Endowment, 14(9), pp. 1640-1654. DOI: 10.14778/3461535.3461552. Online publication date: 22-Oct-2021.
• (2020) Factor graph grammars. Proceedings of the 34th International Conference on Neural Information Processing Systems, pp. 6648-6658. DOI: 10.5555/3495724.3496282. Online publication date: 6-Dec-2020.
• (2019) A Formal Proof of the Expressiveness of Deep Learning. Journal of Automated Reasoning, 63(2), pp. 347-368. DOI: 10.1007/s10817-018-9481-5. Online publication date: 1-Aug-2019.
• (2018) Deep homogeneous mixture models. Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 7136-7145. DOI: 10.5555/3327757.3327816. Online publication date: 3-Dec-2018.
• (2018) Online structure learning for feed-forward and recurrent sum-product networks. Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 6944-6954. DOI: 10.5555/3327757.3327798. Online publication date: 3-Dec-2018.
• (2018) Submodular field grammars. Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 4312-4322. DOI: 10.5555/3327144.3327343. Online publication date: 3-Dec-2018.
• Show More Cited By
