Noise label learning through label confidence statistical inference

Published: 05 September 2021

Abstract

Noisy labels are widespread in real-world data and degrade classification performance. Popular methods require a known noise distribution or additional clean supervision, which is usually unavailable in practical scenarios. This paper presents a theoretical statistical method and designs a label confidence inference (LISR) algorithm to handle this issue. For the data distribution, we define a statistical function of label inconsistency and analyze its relationship with the neighbor radius. For the data representation, we define the notions of trusted-neighbor, nearest-trusted-neighbor, and untrusted-neighbor. For noisy-label recognition, we present three inference methods that predict labels and their confidence. The LISR algorithm establishes a practical statistical model, queries the initial trusted instances, iteratively searches for further trusted instances, and corrects labels. We conducted experiments on synthetic, UCI, and classic image datasets. Significance tests verified the effectiveness of LISR and its superiority over state-of-the-art noisy-label learning algorithms.
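The abstract gives only a high-level view of LISR's iterative procedure. As a rough illustration, the Python sketch below shows one plausible reading of that loop: trusted neighbors within a fixed radius vote on each not-yet-trusted instance, and instances whose majority vote is confident enough are relabeled and promoted to the trusted set. Everything here (the name lisr_sketch, the radius-based voting rule, and the conf_threshold parameter) is an assumption inferred from the abstract alone, not the authors' actual algorithm.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def lisr_sketch(X, y_noisy, trusted_idx, radius=1.0,
                    conf_threshold=0.8, max_iter=10):
        """Hypothetical trusted-instance expansion inferred from the abstract:
        trusted neighbors within `radius` vote on each instance's label, and
        confidently voted instances are corrected and promoted to trusted."""
        y = y_noisy.copy()
        trusted = set(trusted_idx)            # initially queried trusted instances
        nn = NearestNeighbors(radius=radius).fit(X)
        neighborhoods = nn.radius_neighbors(X, return_distance=False)

        for _ in range(max_iter):
            newly_trusted = []
            for i in range(len(X)):
                if i in trusted:
                    continue
                # Only trusted neighbors get a vote on instance i's label.
                votes = [y[j] for j in neighborhoods[i] if j in trusted and j != i]
                if not votes:
                    continue                  # no trusted neighbor within the radius
                labels, counts = np.unique(votes, return_counts=True)
                confidence = counts.max() / counts.sum()
                if confidence >= conf_threshold:
                    y[i] = labels[counts.argmax()]   # confirm or correct the label
                    newly_trusted.append(i)
            if not newly_trusted:
                break                         # fixed point: nothing new passed the test
            trusted.update(newly_trusted)
        return y, trusted

Under this reading, the neighbor radius and the confidence threshold together control how aggressively labels are corrected, which matches the abstract's emphasis on analyzing label inconsistency as a function of the neighbor radius.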



Published In

Knowledge-Based Systems, Volume 227, Issue C, September 2021, 845 pages

Publisher

Elsevier Science Publishers B.V., Netherlands


Author Tags

  1. Confidence prediction
  2. Noise label
  3. Label inconsistency
  4. Statistical inference

Qualifiers

  • Research-article
