Noise label learning through label confidence statistical inference

Published: 05 September 2021

Abstract

Noisy labels are widespread in real-world data and degrade classification performance. Popular methods require a known noise distribution or additional clean supervision, which is usually unavailable in practical scenarios. This paper presents a theoretical statistical method and designs a label confidence inference (LISR) algorithm to handle this issue. For the data distribution, we define a statistical function of label inconsistency and analyze its relationship with the neighbor radius. For the data representation, we define the notions of trusted-neighbor, nearest-trusted-neighbor, and untrusted-neighbor. For noisy-label recognition, we present three inference methods that predict labels and their confidence. The LISR algorithm establishes a practical statistical model, queries the initial trusted instances, iteratively searches for further trusted instances, and corrects labels. We conducted experiments on synthetic, UCI, and classic image datasets. Significance tests verified the effectiveness of LISR and its superiority over state-of-the-art noisy-label learning algorithms.
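The abstract gives only a high-level view of LISR's iterative procedure. As a rough illustration, the Python sketch below shows one plausible reading of that loop: trusted neighbors within a fixed radius vote on each not-yet-trusted instance, and instances whose majority vote is confident enough are relabeled and promoted to the trusted set. Everything here (the name lisr_sketch, the radius-based voting rule, and the conf_threshold parameter) is an assumption inferred from the abstract alone, not the authors' actual algorithm.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def lisr_sketch(X, y_noisy, trusted_idx, radius=1.0,
                    conf_threshold=0.8, max_iter=10):
        """Hypothetical trusted-instance expansion inferred from the abstract:
        trusted neighbors within `radius` vote on each instance's label, and
        confidently voted instances are corrected and promoted to trusted."""
        y = y_noisy.copy()
        trusted = set(trusted_idx)            # initially queried trusted instances
        nn = NearestNeighbors(radius=radius).fit(X)
        neighborhoods = nn.radius_neighbors(X, return_distance=False)

        for _ in range(max_iter):
            newly_trusted = []
            for i in range(len(X)):
                if i in trusted:
                    continue
                # Only trusted neighbors get a vote on instance i's label.
                votes = [y[j] for j in neighborhoods[i] if j in trusted and j != i]
                if not votes:
                    continue                  # no trusted neighbor within the radius
                labels, counts = np.unique(votes, return_counts=True)
                confidence = counts.max() / counts.sum()
                if confidence >= conf_threshold:
                    y[i] = labels[counts.argmax()]   # confirm or correct the label
                    newly_trusted.append(i)
            if not newly_trusted:
                break                         # fixed point: nothing new passed the test
            trusted.update(newly_trusted)
        return y, trusted

Under this reading, the neighbor radius and the confidence threshold together control how aggressively labels are corrected, which matches the abstract's emphasis on analyzing label inconsistency as a function of the neighbor radius.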



Published In

Knowledge-Based Systems, Volume 227, Issue C, September 2021, 845 pages

Publisher

Elsevier Science Publishers B.V., Netherlands


Author Tags

  1. Confidence prediction
  2. Noise label
  3. Label inconsistency
  4. Statistical inference

Qualifiers

  • Research-article
