DOI: 10.1007/11503415_8
Article

A PAC-Style Model for Learning from Labeled and Unlabeled Data

Published: 27 June 2005

Abstract

There has been growing practical interest in using unlabeled data together with labeled data in machine learning, and a number of different approaches have been developed. However, the assumptions these methods are based on are often quite distinct and are not captured by standard theoretical models. In this paper we describe a PAC-style framework that can model many of these assumptions, and we analyze sample-complexity issues in this setting: that is, how much of each type of data one should expect to need in order to learn well, and what basic quantities those numbers depend on. Our model can be viewed as an extension of the standard PAC model in which, in addition to a concept class C, one also proposes a type of compatibility that one believes the target concept should have with the underlying distribution. In this view, unlabeled data can be helpful because it allows one to estimate compatibility over the space of hypotheses, and to reduce the search space to those hypotheses that, according to one's assumptions, are a priori reasonable with respect to the distribution. We discuss a number of technical issues that arise in this context, and we provide sample-complexity bounds both for uniform convergence and for ε-cover-based algorithms. We also consider algorithmic issues, and give an efficient algorithm for a special case of co-training.
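To make the framework concrete, here is a minimal sketch of the kind of object the abstract describes (the notation is illustrative, not quoted from the paper). One augments the concept class C with a compatibility function χ : C × X → [0,1] and measures how compatible a hypothesis h is with the distribution D by

    compat(h, D) = E_{x ~ D}[ χ(h, x) ].

Writing C_{D,χ}(ε) = { h ∈ C : compat(h, D) ≥ 1 − ε } for the set of hypotheses that are nearly fully compatible with D, unlabeled data serves to estimate compat(h, D) and thereby prune C down to C_{D,χ}(ε), after which labeled data only needs to single out the target within the pruned set. For a finite class and a fully compatible target, one would therefore expect a labeled-sample bound of the flavor

    m_l = O( (1/ε) ( ln |C_{D,χ}(ε)| + ln(1/δ) ) ),

so the labeled-sample complexity scales with the size of the pruned class rather than with |C|.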




      Published In

      COLT'05: Proceedings of the 18th annual conference on Learning Theory
      June 2005
      690 pages
ISBN: 3540265562
Editors: Peter Auer, Ron Meir

      Sponsors

      • Pascal
      • Google Inc.
      • Machine Learning Journal/Springer
      • BiCi

      Publisher

Springer-Verlag, Berlin, Heidelberg


      Cited By

• (2023) Learning with explanation constraints. Proceedings of the 37th International Conference on Neural Information Processing Systems, pages 49883-49926. DOI: 10.5555/3666122.3668294. Online publication date: 10-Dec-2023.
• (2023) On regularization and inference with label constraints. Proceedings of the 40th International Conference on Machine Learning, pages 35740-35762. DOI: 10.5555/3618408.3619893. Online publication date: 23-Jul-2023.
• (2020) Negative sampling in semi-supervised learning. Proceedings of the 37th International Conference on Machine Learning, pages 1704-1714. DOI: 10.5555/3524938.3525097. Online publication date: 13-Jul-2020.
• (2019) Learning from Incomplete and Inaccurate Supervision. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 1017-1025. DOI: 10.1145/3292500.3330902. Online publication date: 25-Jul-2019.
• (2018) Never-ending learning. Communications of the ACM, 61(5):103-115. DOI: 10.1145/3191513. Online publication date: 24-Apr-2018.
• (2016) Multi-step learning and underlying structure in statistical models. Proceedings of the 30th International Conference on Neural Information Processing Systems, pages 4815-4823. DOI: 10.5555/3157382.3157635. Online publication date: 5-Dec-2016.
• (2015) Reducing the Unlabeled Sample Complexity of Semi-Supervised Multi-View Learning. Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 627-634. DOI: 10.1145/2783258.2783409. Online publication date: 10-Aug-2015.
• (2015) Toward a Unified Theory of Sparse Dimensionality Reduction in Euclidean Space. Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, pages 499-508. DOI: 10.1145/2746539.2746541. Online publication date: 14-Jun-2015.
• (2012) Inductive multi-task learning with multiple view data. Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 543-551. DOI: 10.1145/2339530.2339617. Online publication date: 12-Aug-2012.
• (2010) Co-regularization based semi-supervised domain adaptation. Proceedings of the 24th International Conference on Neural Information Processing Systems - Volume 1, pages 478-486. DOI: 10.5555/2997189.2997243. Online publication date: 6-Dec-2010.
