A robust optimization method for label noisy datasets based on adaptive threshold: Adaptive-k

Published: 16 December 2023

Abstract

The use of all samples in the optimization process does not produce robust results on datasets with label noise, because the gradients computed from the losses of noisy samples push the optimization in the wrong direction. In this paper, we recommend using only the samples in each mini-batch whose loss is below a threshold determined during optimization, rather than all samples. Our proposed method, Adaptive-k, aims to exclude noisy-labeled samples from the optimization process and thereby make training robust. On noisy datasets, we find that a threshold-based approach such as Adaptive-k produces better results than using all samples or a fixed number of low-loss samples per mini-batch. On the basis of our theoretical analysis and experimental results, we show that Adaptive-k comes closest to the performance of the Oracle, in which noisy samples are removed from the dataset entirely. Adaptive-k is a simple but effective method: it requires no prior knowledge of the dataset's noise ratio, no additional model training, and no significant increase in training time. In the experiments, we also show that Adaptive-k is compatible with different optimizers such as SGD, SGDM, and Adam. The code for Adaptive-k is available on GitHub.
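
To make the selection rule concrete, the Python/PyTorch sketch below shows one way a single mini-batch step could apply loss thresholding: per-sample losses are computed without reduction, only samples whose loss falls below the current threshold contribute to the gradient, and the threshold is then nudged toward the mean loss of the kept samples. This is an illustrative sketch only; the function name thresholded_step and the exponential-moving-average threshold update are assumptions made for illustration, not the Adaptive-k rule from the paper.

    import torch
    import torch.nn.functional as F

    def thresholded_step(model, optimizer, inputs, targets, threshold, momentum=0.9):
        # Illustrative sketch of loss-thresholded training, not the authors'
        # exact Adaptive-k update; the moving-average threshold rule is assumed.
        optimizer.zero_grad()
        logits = model(inputs)
        # Per-sample losses (reduction="none") so each sample can be kept or dropped.
        losses = F.cross_entropy(logits, targets, reduction="none")
        keep = losses < threshold              # samples currently treated as clean
        if keep.any():
            kept_loss = losses[keep].mean()
            kept_loss.backward()               # gradient flows only through low-loss samples
            optimizer.step()
            # Assumed placeholder update: drift the threshold toward the kept samples' mean loss.
            threshold = momentum * threshold + (1.0 - momentum) * kept_loss.item()
        return threshold

Because the selection only changes which per-sample losses enter the averaged objective, the same step can be driven by SGD, SGDM, or Adam, in line with the optimizer-agnostic behavior the abstract reports.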



Published In

Frontiers of Computer Science: Selected Publications from Chinese Universities, Volume 18, Issue 4
Aug 2024
210 pages

Publisher

Springer-Verlag, Berlin, Heidelberg

Publication History

Published: 16 December 2023
Accepted: 03 April 2023
Received: 07 July 2022

Author Tags

1. robust optimization
2. label noise
3. noisy label
4. deep learning
5. noisy datasets
6. noise ratio estimation
7. robust training

Qualifiers

• Research-article
