Abstract
Label-noise robust logistic regression (rLR) is an extension of logistic regression that incorporates a model of random mislabelling. This paper presents a theoretical analysis of rLR. By decomposing and interpreting the gradient of the rLR likelihood objective as used in gradient-ascent optimisation, we gain insight into how the rLR learning algorithm counteracts the negative effect of mislabelling through an intrinsic re-weighting mechanism. We also give an upper bound on the error of rLR using Rademacher complexities.
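The re-weighting mechanism described above can be made concrete with a minimal sketch of the rLR likelihood gradient. This is an illustrative reconstruction, not the authors' code: it assumes a binary model in which `gamma[j, k]` denotes the flip probability P(observed label k | true label j), so the observed-label probability is a gamma-weighted mixture of the clean posteriors. Differentiating the log-likelihood of the noisy labels then yields the standard logistic-regression gradient multiplied by a per-example weight that depends on the flip probabilities:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rlr_loglik_grad(w, X, y_noisy, gamma):
    """Gradient of the rLR log-likelihood w.r.t. w (illustrative sketch).

    gamma[j, k] = assumed P(observed label k | true label j), a 2x2 matrix.
    """
    p1 = sigmoid(X @ w)                                     # P(true y = 1 | x)
    # Probability of the observed (possibly flipped) labels:
    p_tilde1 = gamma[0, 1] * (1 - p1) + gamma[1, 1] * p1    # P(noisy y = 1 | x)
    p_tilde0 = gamma[0, 0] * (1 - p1) + gamma[1, 0] * p1    # P(noisy y = 0 | x)
    # Per-example weights arising from differentiating log P(noisy y | x);
    # these shrink the influence of points the noise model deems unreliable:
    s = np.where(y_noisy == 1,
                 (gamma[1, 1] - gamma[0, 1]) / p_tilde1,
                 (gamma[1, 0] - gamma[0, 0]) / p_tilde0)
    # d/dw log P(noisy y_i | x_i) = s_i * p1_i * (1 - p1_i) * x_i
    return X.T @ (s * p1 * (1 - p1))
```

A sanity check on this sketch: with `gamma` equal to the identity (no label noise), the weights reduce to 1/sigma and -1/(1-sigma), and the gradient collapses to the ordinary logistic-regression gradient X.T @ (y - sigmoid(X @ w)), which is the sense in which rLR is an intrinsic re-weighting of the standard update.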
© 2013 Springer-Verlag Berlin Heidelberg
Bootkrajang, J., Kabán, A. (2013). Learning a Label-Noise Robust Logistic Regression: Analysis and Experiments. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2013. IDEAL 2013. Lecture Notes in Computer Science, vol 8206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41278-3_69
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41277-6
Online ISBN: 978-3-642-41278-3