Abstract
We prove new lower bounds for learning intersections of halfspaces, one of the most important concept classes in computational learning theory. Our main result is that any statistical-query algorithm for learning the intersection of \(\sqrt{n}\) halfspaces in n dimensions must make \(2^{\Omega(\sqrt{n})}\) queries. This is the first non-trivial lower bound on the statistical query dimension for this concept class (the previous best lower bound was n \(^{\Omega(log{\it n})}\)). Our lower bound holds even for intersections of low-weight halfspaces. In the latter case, it is nearly tight.
We also show that the intersection of two majorities (low-weight halfspaces) cannot be computed by a polynomial threshold function (PTF) with fewer than n \(^{\Omega((log{\it n})/loglog{\it n})}\) monomials. This is the first super-polynomial lower bound on the PTF length of this concept class, and is nearly optimal. For intersections of k=ω(logn) low-weight halfspaces, we improve our lower bound to \(\min\{2^{\Omega(\sqrt{n})},n^{\Omega(k/\log k)}\},\) which too is nearly optimal. As a consequence, intersections of even two halfspaces are not computable by polynomial-weight PTFs, the most expressive class of functions known to be efficiently learnable via Jackson’s Harmonic Sieve algorithm. Finally, we report our progress on the weak learnability of intersections of halfspaces under the uniform distribution.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Alekhnovich, M., Braverman, M., Feldman, V., Klivans, A., Pitassi, T.: Learnability and automatizability. In: Proceedings of the 45th Annual Symposium on Foundations of Computer Science (FOCS) (2004)
Aspnes, J., Beigel, R., Furst, M., Rudich, S.: The expressive power of voting polynomials. In: STOC 1991: Proceedings of the twenty-third annual ACM symposium on Theory of computing, pp. 402–409. ACM Press, New York (1991)
Beigel, R., Reingold, N., Spielman, D.: PP is closed under intersection. In: STOC 1991: Proceedings of the twenty-third annual ACM symposium on Theory of computing, pp. 1–9. ACM Press, New York (1991)
Blum, A., Furst, M., Jackson, J., Kearns, M., Mansour, Y., Rudich, S.: Weakly learning DNF and characterizing statistical query learning using Fourier analysis. In: STOC 1994: Proceedings of the twenty-sixth annual ACM symposium on Theory of computing, pp. 253–262. ACM Press, New York (1994)
Blum, A., Kalai, A., Wasserman, H.: Noise-tolerant learning, the parity problem, and the statistical query model. J. ACM 50(4), 506–519 (2003)
Bruck, J.: Harmonic analysis of polynomial threshold functions. SIAM J. Discrete Math. 3(2), 168–177 (1990)
Jackson, J.C.: The harmonic sieve: A novel application of Fourier analysis to machine learning theory and practice. PhD thesis, Carnegie Mellon University (1995)
Kearns, M.: Efficient noise-tolerant learning from statistical queries. In: STOC 1993: Proceedings of the twenty-fifth annual ACM symposium on theory of computing, pp. 392–401. ACM Press, New York (1993)
Klivans, A., O’Donnell, R., Servedio, R.: Learning intersections and thresholds of halfspaces. In: Proceedings of the 43rd Annual Symposium on Foundations of Computer Science, pp. 177–186 (2002)
Klivans, A.R., Servedio, R.A.: Learning intersections of halfspaces with a margin. In: Shawe-Taylor, J., Singer, Y. (eds.) COLT 2004. LNCS, vol. 3120, pp. 348–362. Springer, Heidelberg (2004)
Krause, M., Pudlák, P.: On the computational power of depth 2 circuits with threshold and modulo gates. In: STOC 1994: Proceedings of the twenty-sixth annual ACM symposium on Theory of computing, pp. 48–57. ACM Press, New York (1994)
Kushilevitz, E., Mansour, Y.: Learning decision trees using the fourier spectrum. In: STOC 1991: Proceedings of the twenty-third annual ACM symposium on Theory of computing, pp. 455–464. ACM Press, New York (1991)
Kwek, S., Pitt, L.: PAC learning intersections of halfspaces with membership queries. Algorithmica 22(1/2), 53–75 (1998)
Minsky, M.L., Papert, S.A.: Perceptrons: expanded edition. MIT Press, Cambridge (1988)
O’Donnell, R., Servedio, R.A.: New degree bounds for polynomial threshold functions. In: STOC 2003: Proceedings of the thirty-fifth annual ACM symposium on Theory of computing, pp. 325–334. ACM Press, New York (2003)
Saks, M.E.: Slicing the hypercube. Surveys in combinatorics, 211–255 (1993)
Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984)
Vempala, S.: A random sampling based algorithm for learning the intersection of halfspaces. In: Proceedings of the 38th Annual Symposium on Foundations of Computer Science, pp. 508–513 (1997)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Klivans, A.R., Sherstov, A.A. (2006). Improved Lower Bounds for Learning Intersections of Halfspaces. In: Lugosi, G., Simon, H.U. (eds) Learning Theory. COLT 2006. Lecture Notes in Computer Science(), vol 4005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11776420_26
Download citation
DOI: https://doi.org/10.1007/11776420_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35294-5
Online ISBN: 978-3-540-35296-9
eBook Packages: Computer ScienceComputer Science (R0)