DOI: 10.5555/3020299.3020305

Bayesian network parameter learning using EM with parameter sharing

Published: 27 July 2014

Abstract

This paper explores the effects of parameter sharing on Bayesian network (BN) parameter learning when data are incomplete. Using the Expectation Maximization (EM) algorithm, we investigate how varying degrees of parameter sharing, varying numbers of hidden nodes, and different dataset sizes impact EM performance. The specific metrics of EM performance examined are likelihood, error, and the number of iterations required for convergence. These metrics are important in a number of applications, and we emphasize learning of BNs for diagnosis of electrical power systems. One main point, which we investigate both analytically and empirically, is how parameter sharing impacts the error of EM's parameter estimates.
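To make the setting concrete, here is a minimal sketch of EM with parameter sharing on a toy BN: a hidden binary root H with two observed binary children X1 and X2 that share a single conditional probability table (CPT) P(X | H). This is an illustration under assumptions, not the paper's implementation; the network, variable names, and synthetic data are invented for the example. Sharing enters in the M-step, which pools the expected counts from both children into one shared CPT estimate.

    # Minimal EM-with-parameter-sharing sketch (illustrative, not the paper's code).
    # Toy BN: hidden root H -> observed children X1, X2 sharing one CPT P(X | H).
    import numpy as np

    rng = np.random.default_rng(0)

    # Ground truth, used only to generate synthetic incomplete data (H is latent).
    true_pH = np.array([0.3, 0.7])             # P(H)
    true_theta = np.array([[0.9, 0.1],         # P(X | H=0)
                           [0.2, 0.8]])        # P(X | H=1)

    N = 2000
    H = rng.choice(2, size=N, p=true_pH)       # never observed by the learner
    X1 = (rng.random(N) < true_theta[H, 1]).astype(int)
    X2 = (rng.random(N) < true_theta[H, 1]).astype(int)

    # Initial parameter guesses.
    pH = np.array([0.5, 0.5])
    theta = np.array([[0.6, 0.4],
                      [0.4, 0.6]])

    prev_ll = -np.inf
    for iteration in range(500):
        # E-step: per-record posterior q(h | x1, x2) proportional to
        # P(h) * P(x1 | h) * P(x2 | h).
        joint = pH[None, :] * theta[:, X1].T * theta[:, X2].T   # shape (N, 2)
        ll = np.log(joint.sum(axis=1)).sum()                    # data log-likelihood
        q = joint / joint.sum(axis=1, keepdims=True)

        # M-step: update the root prior, then pool the expected counts of
        # BOTH children into the single shared CPT -- the parameter sharing.
        pH = q.mean(axis=0)
        counts = np.empty((2, 2))
        for x in (0, 1):
            counts[:, x] = q[X1 == x].sum(axis=0) + q[X2 == x].sum(axis=0)
        theta = counts / counts.sum(axis=1, keepdims=True)

        if ll - prev_ll < 1e-8:                                 # converged
            break
        prev_ll = ll

    print(f"iterations: {iteration}, final log-likelihood: {ll:.1f}")
    print("estimated P(H):    ", np.round(pH, 3))
    print("estimated P(X|H):\n", np.round(theta, 3))
    # Note: EM may converge to a label-swapped solution (H's states permuted);
    # compare to true_pH / true_theta up to that permutation.

Because both children are tied to one CPT, each EM iteration accumulates roughly twice the expected counts per shared parameter, which is one intuition for why sharing can lower the error of EM's estimates, the effect the paper examines.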

Published In

BMAW'14: Proceedings of the Eleventh UAI Conference on Bayesian Modeling Applications Workshop - Volume 1218
July 2014
101 pages
Editors: Kathryn Blackmond Laskey, Jim Jones, Russell Almond

Publisher

CEUR-WS.org

Aachen, Germany

Qualifiers

  • Article
