Abstract
In connection with two-label classification tasks over the Boolean domain, we consider the possibility of combining the key advantages of Bayesian networks and of kernel-based learning systems. This leads us to the basic question of whether the class of decision functions induced by a given Bayesian network can be represented within a low-dimensional inner product space. For Bayesian networks with an explicitly given (full or reduced) parameter collection, we show that the “natural” inner product space has the smallest possible dimension up to a factor of 2 (even up to an additive term of 1 in many cases). For a slight modification of the so-called logistic autoregressive Bayesian network with n nodes, we show that every sufficiently expressive inner product space has dimension at least 2^{n/4}. The main technical contribution of our work consists in uncovering combinatorial and algebraic structures within Bayesian networks such that known techniques for proving lower bounds on the dimension of inner product spaces can be brought into play.
This work has been supported in part by the Deutsche Forschungsgemeinschaft Grant SI 498/7-1.
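To make the notion of the induced decision-function class concrete, here is a minimal sketch (our own illustration, not a construction from the paper): for the edgeless Bayesian network over n Boolean variables, i.e. a naive Bayes model, the induced decision function sign(log P(x, +) − log P(x, −)) is affine in x, so it is realized by an inner product in dimension n + 1 via the feature map φ(x) = (x, 1). All parameter values and names below are illustrative assumptions.

```python
import numpy as np

# Minimal sketch (illustrative, not the paper's construction): for the
# edgeless Bayesian network over n Boolean variables (naive Bayes), the
# induced two-label decision function sign(log P(x,+) - log P(x,-)) is
# affine in x, hence representable in an inner product space of
# dimension n + 1 via the feature map phi(x) = (x_1, ..., x_n, 1).

rng = np.random.default_rng(0)
n = 5
p_pos = rng.uniform(0.1, 0.9, n)      # P(X_i = 1 | label +), assumed values
p_neg = rng.uniform(0.1, 0.9, n)      # P(X_i = 1 | label -), assumed values
prior_pos, prior_neg = 0.6, 0.4       # class priors, assumed values

def log_odds(x):
    """Exact log P(x,+) - log P(x,-) computed from the network parameters."""
    lp = np.log(prior_pos) + np.sum(x * np.log(p_pos) + (1 - x) * np.log(1 - p_pos))
    ln = np.log(prior_neg) + np.sum(x * np.log(p_neg) + (1 - x) * np.log(1 - p_neg))
    return lp - ln

# Weight vector and bias realizing the same function as an inner product:
w = np.log(p_pos / p_neg) - np.log((1 - p_pos) / (1 - p_neg))
b = np.log(prior_pos / prior_neg) + np.sum(np.log((1 - p_pos) / (1 - p_neg)))

for _ in range(20):
    x = rng.integers(0, 2, n)
    phi = np.append(x, 1.0)           # feature map phi(x), dimension n + 1
    assert np.isclose(log_odds(x), np.append(w, b) @ phi)
print("naive Bayes decision function = sign of an (n+1)-dim inner product")
```

This sketch covers only the edgeless case; for networks with edges, the paper's “natural” inner product space instead has one coordinate per entry of the (full or reduced) parameter collection.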
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Nakamura, A., Schmitt, M., Schmitt, N., Simon, H.U. (2004). Bayesian Networks and Inner Product Spaces. In: Shawe-Taylor, J., Singer, Y. (eds) Learning Theory. COLT 2004. Lecture Notes in Computer Science, vol. 3120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27819-1_36
DOI: https://doi.org/10.1007/978-3-540-27819-1_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22282-8
Online ISBN: 978-3-540-27819-1