Abstract
The PAC model of learning, and its extension to real-valued function classes, provides a well-accepted theoretical framework for representing the problem of machine learning from randomly drawn examples. In practice, some form of a priori partial information about the target is often available in addition to the randomly drawn examples. In this paper we extend the PAC model to a scenario of learning with partial information in addition to randomly drawn examples. Under this model, partial information effectively reduces the complexity of the hypothesis class used to learn the target, thereby reducing the sample complexity of the learning problem. This leads to a clear quantitative tradeoff between the amount of partial information and the sample complexity of the problem. The underlying framework combines information-based complexity theory (cf. Traub et al. [18]) with Vapnik-Chervonenkis theory. We introduce a new quantity, $I_{n,d}(F)$, which plays an important role in determining the worth of partial information: it measures the minimal error of approximating a target in a class $F$ by the family of all function classes of pseudo-dimension $d$, given partial information consisting of any $n$ measurements that can be expressed as linear operators. As an application, we consider the problem of learning a Sobolev target class. We calculate the tradeoff between the amount of partial information and the sample complexity, and by obtaining fairly tight upper and lower bounds on $I_{n,d}$ we identify an almost optimal way of providing the partial information.
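To make the new quantity concrete, the following is a minimal LaTeX sketch of $I_{n,d}(F)$ as it can be read off from the description above; the precise definition is given in the body of the paper (see also Ratsaby and Maiorov [14]). The shorthand $N = (L_1, \dots, L_n)$ for the information operator, $\dim_p$ for pseudo-dimension, and $\operatorname{dist}$ for the approximation error in the relevant norm are ours, not necessarily the paper's:

% Sketch under the assumptions stated above: the outer infimum chooses the
% n linear measurements, the supremum takes the worst-case target f in F, and
% the inner infimum chooses an approximating class of pseudo-dimension at
% most d, which may depend on the observed measurement values N(f).
\[
  I_{n,d}(F) \;=\; \inf_{N = (L_1, \dots, L_n)} \; \sup_{f \in F} \;
  \inf_{\substack{\mathcal{H} = \mathcal{H}(N(f)) \\ \dim_p(\mathcal{H}) \le d}} \;
  \operatorname{dist}\bigl(f, \mathcal{H}\bigr).
\]

Read this way, increasing the number of measurements $n$ drives the approximation error $I_{n,d}(F)$ down, while the sample complexity of the remaining statistical problem is governed by the pseudo-dimension $d$ of the class actually fit to the random examples; this is the quantitative tradeoff the abstract refers to.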
References
Abu-Mostafa Y. S. (1990), Learning from Hints in Neural Networks, Journal of Complexity, 6, p.192–198.
Abu-Mostafa Y. S. (1993), Hints and the VC Dimension, Neural Computation, Vol. 5, p.278–288.
Abu-Mostafa Y. S. (1995), Machines that Learn from Hints, Scientific American, Vol. 272, No. 4.
Angluin D., (1988), Queries and Concept Learning, Machine Learning, Vol. 2, p. 319–342.
Bartlett P. L., Long P. M., Williamson R. C., (1994), Fat-Shattering and the Learnability of Real-Valued Functions, Proceedings of the 7th Annual Conference on Computational Learning Theory, p. 299, ACM, New York, N.Y.
Birman M. S., Solomjak M. Z., (1967), Piecewise-polynomial Approximations of Functions of the Classes $W_p^\alpha$, Math. USSR-Sbornik, Vol. 2, No. 3, p. 295–317.
Castelli, V., Cover T. M., (1995), On the exponential value of labeled samples, Pattern Recognition Letters, Vol. 16, No. 1, p.105.
Haussler D., (1992), Decision theoretic generalizations of the PAC model for neural net and other learning applications, Inform. Comput., vol. 100 no. 1, pp. 78–150.
Kulkarni S. R., Mitter S. K., Tsitsiklis J. N., (1993), Active Learning Using Arbitrary Valued Queries, Machine Learning, Vol. 11, p. 23–35.
Mitchell T. M., Keller R., Kedar-Cabelli S. (1986) Explanation-based generalization: A unifying view. Machine Learning, Vol. 1, pp. 47–80.
Pinkus A., (1985), “n-widths in Approximation Theory”, New York: Springer-Verlag.
Ratsaby J., Venkatesh S. S., (1995), Learning from a mixture of labeled and unlabeled examples with parametric side information, Proc. Eighth Annual Conference on Computational Learning Theory, p. 412, Morgan Kaufmann, San Mateo, CA.
Ratsaby J., Venkatesh S. S., (1995), The complexity of learning from a mixture of labeled and unlabeled examples, Proc. 33rd Allerton Conference on Communication, Control, and Computing, p. 1002–1009.
Ratsaby J., Maiorov V., (1996), Learning from Examples and Partial Information, Submitted to Journal of Complexity.
Rivest R. L., Eisenberg B., (1990), On the sample complexity of pac-learning using random and chosen examples, Proceedings of the 1990 Workshop on Computational Learning Theory, p. 154–162, Morgan Kaufmann, San Mateo, CA.
Roscheisen M., Hofmann R., Tresp V., (1994), Incorporating Prior Knowledge into Networks of Locally-Tuned Units, in “Computational Learning Theory and Natural Learning Systems”, Hanson S., Petsche T., Kearns M., Rivest R., Editors, MIT Press, Cambridge, MA.
Towell G. G., Shavlik J. W., (1991), Interpretation of artificial neural networks: Mapping knowledge-based neural networks into rules, in Advances in Neural Information Processing Systems 4, pp. 977–984, Denver, CO, Morgan Kaufmann.
Traub J. F., Wasilkowski G. W., Wozniakowski H., (1988), “Information-Based Complexity”, Academic Press Inc.
Valiant L. G., (1984), A Theory of the Learnable, Comm. ACM, Vol. 27, No. 11, p. 1134–1142.
Vapnik V. N., Chervonenkis A. Ya., (1971), On the uniform convergence of relative frequencies of events to their probabilities, Theory Probab. Appl., Vol. 16, No. 2, p. 264–280.
Vapnik V. N., Chervonenkis A. Ya., (1981), Necessary and sufficient conditions for the uniform convergence of means to their expectations, Theory Probab. Appl., Vol. 26, No. 3, p. 532–553.
Vapnik V.N., (1982), “Estimation of Dependences Based on Empirical Data”, Springer-Verlag, Berlin.
Cite this paper
Ratsaby, J., Maiorov, V. (1997). Generalization of the PAC-model for learning with partial information. In: Ben-David, S. (ed.), Computational Learning Theory (EuroCOLT 1997), Lecture Notes in Computer Science, vol. 1208. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-62685-9_6
Print ISBN: 978-3-540-62685-5
Online ISBN: 978-3-540-68431-2