Abstract
In many applications, large amounts of unlabeled data are available at little cost. It is therefore natural to ask whether this unlabeled data can be exploited in classification learning. In this paper, we analyze the role of unlabeled data in the context of naive Bayesian learning. Experimental results show that including unlabeled data in the training set can significantly improve classification accuracy.
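The abstract does not describe the paper's algorithm, so the following is only a generic illustration of how unlabeled data can enter naive Bayesian training. It is a minimal sketch assuming binary features and a standard EM-style loop: fit on the labeled examples, estimate soft class memberships for the unlabeled examples, then refit on both. The function names (`fit_nb`, `posterior`, `em_naive_bayes`), the Laplace smoothing constant `alpha`, and the toy data are all hypothetical and are not taken from the paper.

```python
import math


def fit_nb(X, weights, n_classes, alpha=1.0):
    """Fit a Bernoulli naive Bayes model from (possibly soft) class weights.

    X is a list of binary feature vectors; weights[i][c] is the weight of
    example i for class c (1.0/0.0 for labeled data, a probability otherwise).
    """
    n_features = len(X[0])
    class_mass = [sum(w[c] for w in weights) for c in range(n_classes)]
    total = sum(class_mass)
    # Laplace-smoothed class priors.
    priors = [(class_mass[c] + alpha) / (total + alpha * n_classes)
              for c in range(n_classes)]
    # Laplace-smoothed P(feature j = 1 | class c).
    cond = [[(sum(w[c] * x[j] for x, w in zip(X, weights)) + alpha)
             / (class_mass[c] + 2 * alpha)
             for j in range(n_features)]
            for c in range(n_classes)]
    return priors, cond


def posterior(x, priors, cond):
    """Return P(class | x) under the naive independence assumption."""
    log_p = []
    for c, prior in enumerate(priors):
        lp = math.log(prior)
        for j, xj in enumerate(x):
            p = cond[c][j]
            lp += math.log(p if xj else 1.0 - p)
        log_p.append(lp)
    m = max(log_p)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in log_p]
    z = sum(exps)
    return [e / z for e in exps]


def em_naive_bayes(X_lab, y_lab, X_unl, n_classes, n_iter=10):
    """EM-style training on labeled plus unlabeled data (illustrative only)."""
    hard = [[1.0 if c == y else 0.0 for c in range(n_classes)] for y in y_lab]
    priors, cond = fit_nb(X_lab, hard, n_classes)  # start from labeled data
    for _ in range(n_iter):
        # E-step: soft class memberships for the unlabeled examples.
        soft = [posterior(x, priors, cond) for x in X_unl]
        # M-step: refit on labeled (hard) and unlabeled (soft) examples.
        priors, cond = fit_nb(X_lab + X_unl, hard + soft, n_classes)
    return priors, cond


# Toy data (hypothetical): two labeled and four unlabeled examples.
X_lab = [[1, 1, 0, 0], [0, 0, 1, 1]]
y_lab = [0, 1]
X_unl = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
priors, cond = em_naive_bayes(X_lab, y_lab, X_unl, n_classes=2)
probs = posterior([1, 0, 0, 0], priors, cond)
```

In this sketch the labeled examples keep their hard labels in every iteration, while the unlabeled examples contribute fractional counts. Whether such a scheme helps in practice depends on how well the naive Bayes assumptions fit the data.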
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, CH. (2006). A Semi-naive Bayesian Learning Method for Utilizing Unlabeled Data. In: Gabrys, B., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2006. Lecture Notes in Computer Science, vol 4251. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11892960_23
DOI: https://doi.org/10.1007/11892960_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46535-5
Online ISBN: 978-3-540-46536-2
eBook Packages: Computer Science, Computer Science (R0)