Learnability and the Vapnik-Chervonenkis dimension

Published: 01 October 1989

Abstract

Valiant's learnability model is extended to learning classes of concepts defined by regions in Euclidean space E^n. The methods in this paper lead to a unified treatment of some of Valiant's results, along with previous results on distribution-free convergence of certain pattern recognition algorithms. It is shown that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned. Using this parameter, the complexity and closure properties of learnable classes are analyzed, and necessary and sufficient conditions for feasible learnability are given.
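
As an informal illustration (not part of the paper), the combinatorial flavor of this parameter can be made concrete: a finite point set is shattered by a concept class if every one of its 2^m labelings is realized by some concept, and the VC dimension is the size of the largest shattered set. The Python sketch below brute-forces this definition for a toy class of closed intervals on the line restricted to a small discrete domain; the function names and the discretized domain are assumptions made for the example, not constructions from the paper.

    # Illustrative sketch (assumed example, not the paper's algorithm):
    # brute-force shattering check and VC dimension for intervals [lo, hi].
    from itertools import combinations

    def labelings(concepts, points):
        """Distinct {0,1}-labelings the interval concepts induce on the points."""
        return {tuple(int(lo <= x <= hi) for x in points) for (lo, hi) in concepts}

    def is_shattered(concepts, points):
        """A point set is shattered if all 2^|points| labelings are realized."""
        return len(labelings(concepts, points)) == 2 ** len(points)

    def vc_dimension(concepts, domain):
        """Largest size of a subset of `domain` shattered by `concepts` (brute force)."""
        d = 0
        for k in range(1, len(domain) + 1):
            if any(is_shattered(concepts, s) for s in combinations(domain, k)):
                d = k
            else:
                # If no k-set is shattered, no larger set can be shattered either,
                # since every subset of a shattered set is shattered.
                break
        return d

    if __name__ == "__main__":
        domain = [0, 1, 2, 3, 4]
        # All intervals [a, b] with endpoints drawn from the domain.
        intervals = [(a, b) for a in domain for b in domain if a <= b]
        print(vc_dimension(intervals, domain))  # prints 2

The sketch prints 2, matching the standard fact that intervals on the line have VC dimension 2; by the paper's characterization, such a finite-dimensional class is distribution-free learnable from a sample whose size is polynomial in 1/ε and 1/δ.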

References

[1]
AHO, A., HOPCROFT, J., AND ULLMAN, J. The Design and Analysis of Computer Algorithms. Addison-Wesley, London, 1974.
[2]
ANGLUIN, D. Learning regular sets from queries and counterexamples. Inf. Comput. 75 (1987), 87-106.
[3]
ANGLUIN, D. Queries and concept learning. Mach. Learning 2, 2 (1988), 319-342.
[4]
ANGLUIN, D., AND LAIRD, P. D. Learning from noisy examples. Mach. Learning 2, 2 (1988), 343-370.
[5]
ANGLUIN, D., AND VALIANT, L. Fast probabilistic algorithms for Hamiltonian circuits and matchings. J. Comput. Syst. Sci. 19 (1979), 155-193.
[6]
ASSOUAD, P. Densité et dimension. Ann. Inst. Fourier, Grenoble 33, 3 (1983), 233-282.
[7]
BAUM, E., AND HAUSSLER, D. What size net gives valid generalization? Neural Comput. 1, 1 (1989), 151-160.
[8]
BEN-DAVID, S., BENEDEK, G., AND MANSOUR, Y. A parameterization scheme for classifying models of learnability. In Proceedings of the 2nd Workshop on Computational Learning Theory (Santa Cruz, Calif., July 31-Aug. 2). Morgan Kaufmann, San Mateo, Calif., 1989, to appear.
[9]
BENEDEK, G., AND ITAI, A. Nonuniform learnability. In Proceedings of the 15th International Conference on Automata, Languages and Programming (July). 1988, pp. 82-92.
[10]
BLUM, A., AND RIVEST, R. Training a 3-node neural network is NP-complete. In Proceedings of the 1st Workshop on Computational Learning Theory (Cambridge, Mass., Aug. 3-5). Morgan Kaufmann, San Mateo, Calif., 1988, pp. 9-18.
[11]
BLUMER, A., EHRENFEUCHT, A., HAUSSLER, D., AND WARMUTH, M. K. Occam's Razor. Inf. Process. Lett. 24 (1987), 377-380.
[12]
BOLLOBÁS, B. Combinatorics. Cambridge Univ. Press, Cambridge, 1986.
[13]
BOUCHERON, S., AND SALLANTIN, J. Some remarks about space-complexity of learning and circuit complexity of recognizing. In Proceedings of the 1st Workshop on Computational Learning Theory (Cambridge, Mass., Aug. 3-5). Morgan Kaufmann, San Mateo, Calif., 1988, pp. 125-138.
[14]
COVER, T. Geometrical and statistical properties of systems of linear inequalities with applications to pattern recognition. IEEE Trans. Elect. Comp. 14 (1965), 326-334.
[15]
DEVROYE, L.P. Automatic pattern recognition: A study of the probability of error. IEEE Trans. Pattern Analysis and Mach. Intell. 10, 4 (1988), 530-543.
[16]
DEVROYE, L. P., AND WAGNER, T. J. A distribution-free performance bound in error estimation. IEEE Trans. Inf. Theory 22 (1976), 586-587.
[17]
DUDA, R., AND HART, P. Pattern Classification and Scene Analysis. Wiley, New York, 1973.
[18]
DUDLEY, R.M. A course on empirical processes. In Lecture Notes in Mathematics, vol. 1097. Springer-Verlag, New York, 1984.
[19]
DUDLEY, R. M. Universal Donsker classes and metric entropy. Ann. Prob. 15, 4 (1987), 1306-1326.
[20]
EDELSBRUNNER, H., AND PREPARATA, F. P. Minimal polygonal separation. Inf. Comput. 77 (1988), 218-232.
[21]
EHRENFEUCHT, A., HAUSSLER, D., KEARNS, M., AND VALIANT, L.G. A general lower bound on the number of examples needed for learning. Inf. Comput., to appear.
[22]
GAREY, M., AND JOHNSON, D. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, San Francisco, 1979.
[23]
GILL, J. Probabilistic Turing machines. SIAM J. Comput. 6, 4 (1977), 675-695.
[24]
GINÉ, E., AND ZINN, J. Lectures on the central limit theorem for empirical processes. In Lecture Notes in Mathematics, vol. 1221. Springer-Verlag, New York, 1986.
[25]
GOLDREICH, O., GOLDWASSER, S., AND MICALI, S. How to construct random functions. J. ACM 33, 4 (Oct. 1986), 792-807.
[26]
HAMPSON, S. E., AND VOLPER, D. Linear function neurons: Structure and training. Biol. Cyber. 53 (1986), 203-217.
[27]
HAUSSLER, D. Learning conjunctive concepts in structural domains. Mach. Learning 4, 1 (1989)
[28]
HAUSSLER, D. Applying Valiant's learning framework to AI concept learning problems. In Proceedings of the 4th International Workshop on Machine Learning. Morgan Kaufmann, San Mateo, Calif., 1987, pp. 324-336.
[29]
HAUSSLER, D. Quantifying inductive bias: AI learning algorithms and Valiant's learning framework. Artif. Intell. 36 (1988), 177-221.
[30]
HAUSSLER, D. Generalizing the PAC model: Sample size bounds from metric dimension-based uniform convergence results. In Proceedings of the 30th IEEE Symposium on Foundations of Computer Science (Research Triangle Park, N.C., Oct. 30-Nov. 1). IEEE, New York, 1989, to appear.
[31]
HAUSSLER, D., AND WELZL, E. Epsilon-nets and simplex range queries. Discrete Comput. Geometry 2 (1987), 127-151.
[32]
HAUSSLER, D., KEARNS, M., LITTLESTONE, N., AND WARMUTH, M.K. Equivalence of models for polynomial learnability. Tech. Rep. UCSC-CRL-88-06. Univ. California at Santa Cruz, Santa Cruz, Calif., 1988.
[33]
HAUSSLER, D., LITTLESTONE, N., AND WARMUTH, M. K. Predicting {0, 1}-functions on randomly drawn points. In Proceedings of the 29th IEEE Symposium on Foundations of Computer Science (White Plains, N.Y., Oct.). IEEE, New York, 1988, pp. 100-109.
[34]
JOHNSON, D.S. Approximation algorithms for combinatorial problems. J. Comput. Syst. Sci. 9 (1974), 256-278.
[35]
KARMARKAR, N. A new polynomial-time algorithm for linear programming. Combinatorica 4 (1984), 373-395.
[36]
KEARNS, M., AND LI, M. Learning in the presence of malicious errors. In Proceedings of the 20th Symposium on Theory of Computing (Chicago, Ill., May 2-4). ACM, New York, 1988, pp. 267-280.
[37]
KEARNS, M., AND VALIANT, L. Cryptographic limitations on learning Boolean formulae and finite automata. In Proceedings of the 21st ACM Symposium on Theory of Computing (Seattle, Wash., May 15-17). ACM, New York, 1989, pp. 433-444.
[38]
KEARNS, M., LI, M., PITT, L., AND VALIANT, L. On the learnability of Boolean formulae. In Proceedings of the 19th ACM Symposium on Theory of Computation (New York, N.Y., May 25-27). ACM, New York, 1987, pp. 285-295.
[39]
KEARNS, M., LI, M., PITT, L., AND VALIANT, L. Recent results on Boolean concept learning. In Proceedings of the 4th International Workshop on Machine Learning. Morgan Kaufmann, San Mateo, Calif., 1987, pp. 337-352.
[40]
KNUTH, D. E. The Art of Computer Programming, 2nd ed., vol. 1. Addison-Wesley, Reading, Mass., 1973.
[41]
LAIRD, P.D. Learning from good data and bad. Tech. Rep. YALEU/DCS/TR-551. Yale Univ., New Haven, Conn., 1987.
[42]
LEE, D. T., AND PREPARATA, F.P. Computational geometry--A survey. IEEE Trans. Comput. 33, 12 (1984), 1072-1101.
[43]
LINIAL, N., MANSOUR, Y., AND RIVEST, R. Results on learnability and the Vapnik-Chervonenkis dimension. In Proceedings of the 29th IEEE Symposium on Foundations of Computer Science (White Plains, N.Y., Oct.). IEEE, New York, 1988, pp. 120-129.
[44]
LITTLESTONE, N. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Mach. Learning 2, 2 (1988), 285-318.
[45]
MASEK, W.J. Some NP-complete set cover problems. MIT Laboratory for Computer Science, unpublished manuscript.
[46]
MEGIDDO, N. Linear programming in linear time when the dimension is fixed. J. ACM 31, 1 (Jan. 1984), 114-127.
[47]
MEGIDDO, N. On the complexity of polyhedral separability. Discrete Comput. Geometry 3 (1988), 325-337.
[48]
MUROGA, S. Threshold Logic and Its Applications. Wiley, New York, 1971.
[49]
NATARAJAN, B. K. On learning Boolean functions. In Proceedings of the 19th Annual ACM Symposium on Theory of Computation (New York, N.Y., May 25-27). ACM, New York, 1987, pp. 296-304.
[50]
NATARAJAN, B.K. Learning functions from examples. Tech. Rep. CMU-RI-TR-87-19. Carnegie Mellon Univ., Pittsburgh, Pa., Aug. 1987.
[51]
NIGMATULLIN, R. G. The fastest descent method for covering problems (in Russian). In Proceedings of a Symposium on Questions of Precision and Efficiency of Computer Algorithms, Book 5. Kiev, 1969, pp. 116-126.
[52]
PEARL, J. On the connection between the complexity and credibility of inferred models. Int. J. Gen. Syst. 4 (1978), 255-264.
[53]
PEARL, J. Capacity and error estimates for Boolean classifiers with limited capacity. IEEE Trans. Pattern Analysis Mach. Intell. 1, 4 (1979), 350-355.
[54]
PITT, L., AND VALIANT, L.G. Computational limitations on learning from examples. J. ACM 35, 4 (Oct. 1988), 965-984.
[55]
PITT, L., AND WARMUTH, M. Reductions among prediction problems: On the difficulty of predicting automata. In Proceedings of the 3rd IEEE Structure in Complexity Theory Conference (Washington, D.C.). IEEE, New York, 1988, pp. 62-69.
[56]
POLLARD, D. Convergence of Stochastic Processes. Springer-Verlag, New York, 1984.
[57]
QUINLAN, R., AND RIVEST, R. Inferring decision trees using the minimum description length principle. Inf. Comput., to appear.
[58]
RIVEST, R. Learning decision-lists. Mach. Learning 2, 3 (1987), 229-246.
[59]
ROYDEN, H. L. Real Analysis, 2nd ed. Macmillan, New York, 1968.
[60]
SLOAN, R. Types of noise in data for concept learning. In Proceedings of the 1st Workshop on Computational Learning Theory (Cambridge, Mass., Aug. 3-5). Morgan Kaufmann, San Mateo, Calif., 1988, pp. 91-96.
[61]
VALIANT, L. G. A theory of the learnable. Commun. ACM 27, 11 (Nov. 1984), 1134-1142.
[62]
VALIANT, L. G. Learning disjunctions of conjunctions. In Proceedings of the 9th International Joint Conference on Artificial Intelligence (Los Angeles, Calif., Aug.), vol. 1. Morgan Kaufmann, San Mateo, Calif., 1985, pp. 560-566.
[63]
VAPNIK, V. N. Estimation of Dependences Based on Empirical Data. Springer-Verlag, New York, 1982.
[64]
VAPNIK, V. N., AND CHERVONENKIS, A. YA. On the uniform convergence of relative frequencies of events to their probabilities. Theory Prob. Appl. 16, 2 (1971), 264-280.
[65]
VAPNIK, V. N., AND CHERVONENKIS, A. YA. Theory of Pattern Recognition (in Russian). Nauka, Moscow, 1974.
[66]
VITTER, J., AND LIN, J. H. Learning in parallel. In Proceedings of the 1st Workshop on Computational Learning Theory (Cambridge, Mass., Aug. 3-5). Morgan Kaufmann, San Mateo, Calif., 1988, pp. 106-124.
[67]
WATANABE, S. Pattern recognition as information compression. In Frontiers of Pattern Recognition, S. Watanabe, Ed. Academic Press, Orlando, Fla., 1972.
[68]
WENOCUR, R. S., AND DUDLEY, R.M. Some special Vapnik-Chervonenkis classes. Discrete Math. 33 (1981), 313-318.
[69]
WINSTON, P. Learning structural descriptions from examples. In The Psychology of Computer Vision. McGraw-Hill, New York, 1975.

Published In

Journal of the ACM, Volume 36, Issue 4 (Oct. 1989), 279 pages
ISSN: 0004-5411
EISSN: 1557-735X
DOI: 10.1145/76359

Publisher

Association for Computing Machinery, New York, NY, United States
