Abstract
Standard supervised classification learns a classifier from a set of labeled examples. In weakly supervised classification, by contrast, several frameworks have been proposed in which the training data cannot be labeled with certainty. This paper presents the novel problem of learning from positive-unlabeled proportions. The provided examples are unlabeled, and the only class information available consists of the proportions of positive and unlabeled examples in different subsets of the training dataset. An expectation-maximization method that learns Bayesian network classifiers from this kind of data is proposed. A set of experiments has been designed to shed light on how well a classifier can be learned from this kind of data across scenarios of increasing complexity.
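To make the data setting concrete, the following minimal Python sketch enumerates the labelings of one subset (bag) that are compatible with its class information. It is an illustration only, not the method proposed in the paper. It assumes that a bag's positive proportion translates into a known count of positive examples, and that this count acts as a lower bound because the unlabeled remainder may hide further positives. The function name `consistent_labelings` and its parameters are hypothetical.

```python
from itertools import combinations


def consistent_labelings(bag_size, min_positives):
    """Yield every 0/1 labeling of a bag with at least `min_positives` positives.

    Assumption (not stated in the abstract): the known positive count is a
    lower bound, since unlabeled examples may also be positive.
    """
    for k in range(min_positives, bag_size + 1):
        # Choose which k positions of the bag are labeled positive.
        for positive_idx in combinations(range(bag_size), k):
            labeling = [0] * bag_size
            for i in positive_idx:
                labeling[i] = 1
            yield tuple(labeling)


# A bag of 3 examples in which 2 are known to be positive and 1 is unlabeled.
labelings = list(consistent_labelings(bag_size=3, min_positives=2))
for labeling in labelings:
    print(labeling)
```

Under this assumption, the number of compatible labelings grows combinatorially with the bag size and with the share of unlabeled examples, which gives an intuition for why scenarios with weaker class information are harder to learn from.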
Acknowledgments
This work has been partially supported by the Basque Government (IT609-13) and the Spanish Ministry of Economy and Competitiveness MINECO (TIN2013-41272-P). Jerónimo Hernández-González holds a grant (FPU) from the Spanish Ministry of Education, Culture and Sports.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Hernández-González, J., Inza, I., Lozano, J.A. (2015). A Novel Weakly Supervised Problem: Learning from Positive-Unlabeled Proportions. In: Puerta, J., et al. Advances in Artificial Intelligence. CAEPIA 2015. Lecture Notes in Computer Science, vol 9422. Springer, Cham. https://doi.org/10.1007/978-3-319-24598-0_1
DOI: https://doi.org/10.1007/978-3-319-24598-0_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24597-3
Online ISBN: 978-3-319-24598-0
eBook Packages: Computer Science, Computer Science (R0)