Large-Sample Learning of Bayesian Networks is NP-Hard

Published: 01 December 2004

Abstract

In this paper, we provide new complexity results for algorithms that learn discrete-variable Bayesian networks from data. Our results apply whenever the learning algorithm uses a scoring criterion that favors the simplest structure for which the model is able to represent the generative distribution exactly. Our results therefore hold whenever the learning algorithm uses a consistent scoring criterion and is applied to a sufficiently large dataset. We show that identifying high-scoring structures is NP-hard, even when any combination of one or more of the following hold: the generative distribution is perfect with respect to some DAG containing hidden variables; we are given an independence oracle; we are given an inference oracle; we are given an information oracle; we restrict potential solutions to structures in which each node has at most k parents, for all k >= 3. Our proof relies on a new technical result that we establish in the appendices. In particular, we provide a method for constructing the local distributions in a Bayesian network such that the resulting joint distribution is provably perfect with respect to the structure of the network.
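The hardness results apply to any consistent scoring criterion. As a concrete illustration only (not the construction used in the paper), the sketch below computes the BIC score, one standard consistent criterion, for a candidate DAG over fully observed discrete data; the function name, data layout, and toy variables A and B are illustrative assumptions rather than anything specified by the authors. Maximizing such a score over structures, even when each node is restricted to at most k >= 3 parents, is the problem shown to be NP-hard in the large-sample limit.

    import math
    from collections import Counter

    def bic_score(data, structure, arities):
        """BIC (a consistent scoring criterion) for a candidate DAG.

        data:      list of dicts mapping variable name -> observed state
        structure: dict mapping variable name -> tuple of parent names (a DAG)
        arities:   dict mapping variable name -> number of discrete states
        """
        n = len(data)
        score = 0.0
        for var, parents in structure.items():
            r = arities[var]                            # states of this variable
            q = math.prod(arities[p] for p in parents)  # parent configurations (1 if no parents)
            # Sufficient statistics: counts of (parent configuration, child state).
            joint = Counter((tuple(row[p] for p in parents), row[var]) for row in data)
            marginal = Counter(tuple(row[p] for p in parents) for row in data)
            # Maximized log-likelihood contribution of this family.
            for (pa, _), count in joint.items():
                score += count * math.log(count / marginal[pa])
            # Complexity penalty: (r - 1) * q free parameters in this family.
            score -= 0.5 * math.log(n) * (r - 1) * q
        return score

    # Toy usage: two candidate structures over binary variables A and B.
    rows = [{"A": 0, "B": 0}, {"A": 0, "B": 0}, {"A": 1, "B": 1}, {"A": 1, "B": 1}]
    arities = {"A": 2, "B": 2}
    print(bic_score(rows, {"A": (), "B": ("A",)}, arities))  # structure A -> B
    print(bic_score(rows, {"A": (), "B": ()}, arities))      # empty structure

In this toy usage, the structure A -> B scores higher than the empty graph because it captures the deterministic dependence between the two variables while paying only a penalty logarithmic in the sample size, which is the large-sample behavior a consistent criterion is defined to have.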



      Published In

      The Journal of Machine Learning Research, Volume 5 (December 2004)
      1571 pages
      ISSN: 1532-4435
      EISSN: 1533-7928

      Publisher

      JMLR.org

      Publication History

      Published: 01 December 2004
      Published in JMLR Volume 5


      Cited By

      • (2024) Parallel structural learning of Bayesian networks. Knowledge-Based Systems, 296:C. DOI: 10.1016/j.knosys.2024.111840. Online publication date: 19-Jul-2024.
      • (2024) Finding community structure in Bayesian networks by heuristic K-standard deviation method. Future Generation Computer Systems, 158:C, 556-568. DOI: 10.1016/j.future.2024.03.047. Online publication date: 1-Sep-2024.
      • (2023) Distributionally robust skeleton learning of discrete Bayesian networks. Proceedings of the 37th International Conference on Neural Information Processing Systems, 63343-63371. DOI: 10.5555/3666122.3668887. Online publication date: 10-Dec-2023.
      • (2023) iSCAN. Proceedings of the 37th International Conference on Neural Information Processing Systems, 44671-44706. DOI: 10.5555/3666122.3668057. Online publication date: 10-Dec-2023.
      • (2023) Global optimality in bivariate gradient-based DAG learning. Proceedings of the 37th International Conference on Neural Information Processing Systems, 17929-17968. DOI: 10.5555/3666122.3666910. Online publication date: 10-Dec-2023.
      • (2023) Learning DAGs from data with few root causes. Proceedings of the 37th International Conference on Neural Information Processing Systems, 16865-16888. DOI: 10.5555/3666122.3666860. Online publication date: 10-Dec-2023.
      • (2023) DRCFS. Proceedings of the 40th International Conference on Machine Learning, 28468-28491. DOI: 10.5555/3618408.3619589. Online publication date: 23-Jul-2023.
      • (2023) Structural Hawkes processes for learning causal structure from discrete-time event sequences. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 5702-5710. DOI: 10.24963/ijcai.2023/633. Online publication date: 19-Aug-2023.
      • (2023) A Breast Cancer Detection Method Based on Bayesian Networks. Proceedings of the 4th International Conference on Artificial Intelligence and Computer Engineering, 933-937. DOI: 10.1145/3652628.3652783. Online publication date: 17-Nov-2023.
      • (2023) Feature Selection for Efficient Local-to-global Bayesian Network Structure Learning. ACM Transactions on Knowledge Discovery from Data, 18(2), 1-27. DOI: 10.1145/3624479. Online publication date: 19-Sep-2023.
