DOI: 10.5555/3020548.3020588

Sum-product networks: a new deep architecture

Published: 14 July 2011

Abstract

The key limiting factor in graphical model inference and learning is the complexity of the partition function. We thus ask the question: what are general conditions under which the partition function is tractable? The answer leads to a new kind of deep architecture, which we call sum-product networks (SPNs). SPNs are directed acyclic graphs with variables as leaves, sums and products as internal nodes, and weighted edges. We show that if an SPN is complete and consistent it represents the partition function and all marginals of some graphical model, and give semantics to its nodes. Essentially all tractable graphical models can be cast as SPNs, but SPNs are also strictly more general. We then propose learning algorithms for SPNs, based on backpropagation and EM. Experiments show that inference and learning with SPNs can be both faster and more accurate than with standard deep networks. For example, SPNs perform image completion better than state-of-the-art deep networks for this task. SPNs also have intriguing potential connections to the architecture of the cortex.
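
A minimal sketch helps make the tractability claim concrete: in an SPN, every query is answered by a single bottom-up pass over the DAG, with the indicators of unobserved variables set to 1. The following Python sketch (illustrative class and variable names, not the authors' implementation) evaluates a toy complete and consistent SPN over two Boolean variables to obtain its partition function and a marginal:

```python
# Minimal illustrative SPN evaluator (assumed structure, not the paper's code).

class Leaf:
    """Indicator [X = v] for a Boolean variable X."""
    def __init__(self, var, value):
        self.var, self.value = var, value

    def eval(self, evidence):
        # A variable absent from the evidence is marginalized out by
        # setting both of its indicators to 1.
        if self.var not in evidence:
            return 1.0
        return 1.0 if evidence[self.var] == self.value else 0.0

class Product:
    def __init__(self, children):
        self.children = children

    def eval(self, evidence):
        v = 1.0
        for child in self.children:
            v *= child.eval(evidence)
        return v

class Sum:
    def __init__(self, weighted_children):  # list of (weight, child) pairs
        self.weighted_children = weighted_children

    def eval(self, evidence):
        return sum(w * child.eval(evidence) for w, child in self.weighted_children)

# Toy SPN over X1, X2: a two-component mixture of fully factorized distributions.
x1, nx1 = Leaf("X1", 1), Leaf("X1", 0)
x2, nx2 = Leaf("X2", 1), Leaf("X2", 0)
spn = Sum([
    (0.7, Product([Sum([(0.6, x1), (0.4, nx1)]), Sum([(0.9, x2), (0.1, nx2)])])),
    (0.3, Product([Sum([(0.2, x1), (0.8, nx1)]), Sum([(0.5, x2), (0.5, nx2)])])),
])

Z = spn.eval({})                # partition function: all indicators set to 1
p = spn.eval({"X1": 1}) / Z     # marginal P(X1 = 1), one more upward pass
print(Z, p)                     # 1.0 and 0.48 for the weights above
```

Each such query costs time linear in the number of edges, which is the sense in which a complete and consistent SPN makes the partition function and all marginals tractable.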



Published In

UAI'11: Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence
July 2011, 853 pages
ISBN: 9780974903972
Editors: Fabio Cozman, Avi Pfeffer

Sponsors

• Pascal Network of Excellence
• Google Inc.
• Artificial Intelligence Journal
• IBM Research
• Microsoft Research

Publisher

AUAI Press, Arlington, Virginia, United States

Cited By

• (2024) Building expressive and tractable probabilistic generative models. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, pp. 8234-8243. DOI: 10.24963/ijcai.2024/910. Online publication date: 3-Aug-2024.
• (2023) Probabilistic circuits that know what they don't know. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, pp. 2157-2167. DOI: 10.5555/3625834.3626036. Online publication date: 31-Jul-2023.
• (2023) Knowledge intensive learning of cutset networks. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, pp. 1380-1389. DOI: 10.5555/3625834.3625964. Online publication date: 31-Jul-2023.
• (2021) Fauce. Proceedings of the VLDB Endowment, 14(11), pp. 1950-1963. DOI: 10.14778/3476249.3476254. Online publication date: 27-Oct-2021.
• (2021) Are we ready for learned cardinality estimation? Proceedings of the VLDB Endowment, 14(9), pp. 1640-1654. DOI: 10.14778/3461535.3461552. Online publication date: 22-Oct-2021.
• (2020) Factor graph grammars. Proceedings of the 34th International Conference on Neural Information Processing Systems, pp. 6648-6658. DOI: 10.5555/3495724.3496282. Online publication date: 6-Dec-2020.
• (2019) A Formal Proof of the Expressiveness of Deep Learning. Journal of Automated Reasoning, 63(2), pp. 347-368. DOI: 10.1007/s10817-018-9481-5. Online publication date: 1-Aug-2019.
• (2018) Deep homogeneous mixture models. Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 7136-7145. DOI: 10.5555/3327757.3327816. Online publication date: 3-Dec-2018.
• (2018) Online structure learning for feed-forward and recurrent sum-product networks. Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 6944-6954. DOI: 10.5555/3327757.3327798. Online publication date: 3-Dec-2018.
• (2018) Submodular field grammars. Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 4312-4322. DOI: 10.5555/3327144.3327343. Online publication date: 3-Dec-2018.
• Show More Cited By
