Abstract
We propose a logic-based approach to variational Bayes (VB) via propositionalized probability computation in the symbolic-statistical modeling language PRISM. PRISM computes the probabilities of logical formulas by reducing them to AND/OR Boolean formulas, called explanation graphs, over probabilistic \({\tt msw/2}\) atoms. We place Dirichlet priors on the parameters of \({\tt msw/2}\) atoms and derive a variational Bayes EM algorithm that learns the hyperparameters of these priors from data. The algorithm runs on the explanation graphs deduced from a program and a goal and computes probabilities by dynamic programming in time linear in the size of the graphs. To demonstrate the genericity and effectiveness of Bayesian modeling with the proposed approach, we conducted two learning experiments, one with a probabilistic right-corner grammar and the other with a profile-HMM. To our knowledge, no previous application of VB to these models has been reported.
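To illustrate the kind of program the approach operates on, the following is a minimal PRISM sketch of a two-state hidden Markov model; the switch names (init, tr/1, out/1) and the fixed string length are our own illustrative choices, not the models used in the paper's experiments. Each \({\tt values/2}\) declaration introduces a family of probabilistic \({\tt msw/2}\) switches, and a query such as hmm(Cs) is reduced to an explanation graph over the \({\tt msw/2}\) atoms it uses.

% Illustrative sketch only: a two-state HMM over {a,b} emitting strings of length 3.
values(init,   [s0, s1]).     % switch for the initial state
values(tr(_),  [s0, s1]).     % one transition switch per state
values(out(_), [a, b]).       % one emission switch per state

hmm(Cs) :-
    msw(init, S),             % probabilistic choice of the initial state
    hmm(S, 3, Cs).
hmm(_, 0, []).
hmm(S, N, [C|Cs]) :-
    N > 0,
    msw(out(S), C),           % emit a symbol from state S
    msw(tr(S), Next),         % move to the next state
    N1 is N - 1,
    hmm(Next, N1, Cs).

Under VB-EM, each switch's parameter vector receives a Dirichlet prior; in the standard Dirichlet-multinomial update (not the paper's exact notation), the hyperparameters are revised by adding the expected counts of the corresponding \({\tt msw/2}\) outcomes computed on the explanation graph, \(\alpha^{*}_{i,v} = \alpha_{i,v} + \mathbb{E}[c_{i,v}]\).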
Sato, T., Kameya, Y. & Kurihara, K. Variational Bayes via propositionalized probability computation in PRISM. Ann Math Artif Intell 54, 135–158 (2008). https://doi.org/10.1007/s10472-009-9135-8