Prediction Consistency Regularization for Learning with Noise Labels Based on Contrastive Clustering
Figure 1. Sensitivity of λ. We show the evolution of test accuracy during training for varying values of λ.
Figure 2. Sensitivity of K. (a–d) show the evolution of test accuracy, while (e,f) show the evolution of purity on the training set.
Figure 3. Ablation study. (a,b) show the evolution of test accuracy, while (c,d) show the evolution of purity on the training set.
Figure 4. t-SNE visualization of learned representations on the CIFAR-10 training set with 0.8 symmetric noise. Each color represents a distinct class, and all points are colored according to clean labels.
1. Introduction
2. Related Works
2.1. Learning with Noisy Labels
2.2. Contrastive Learning
3. Method
3.1. Problem Formulation
3.2. Twin Contrastive Clustering
3.3. Injecting Label Information to TCC
3.4. Prediction Consistency Regularization Based on Clustering
Algorithm 1: Training Algorithm
4. Experiment
4.1. Evaluation on Synthetic Noise
4.2. Evaluation on Real-World Noise
4.3. Sensitivity of Hyperparameters
4.4. Ablation Study
4.5. Representations Evaluation
4.6. Training Time Analysis
5. Discussion
Author Contributions
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Derivation of ELBO
Appendix B. Calculation of the Expectation Term
Method | Symmetric 0.2 | Symmetric 0.4 | Symmetric 0.6 | Symmetric 0.8 | Asymmetric 0.2 | Asymmetric 0.3 | Asymmetric 0.4 |
CE | |||||||
Forward | |||||||
GCE | |||||||
SCE | |||||||
ELR | |||||||
GJS | |||||||
Co-learning | |||||||
TPCR | |||||||
TPCR(f) |
Method | Symmetric 0.2 | Symmetric 0.4 | Symmetric 0.6 | Symmetric 0.8 | Asymmetric 0.2 | Asymmetric 0.3 | Asymmetric 0.4 |
Standard CE | |||||||
Forward | |||||||
GCE | |||||||
SCE | |||||||
ELR | |||||||
GJS | |||||||
Co-learning | |||||||
TPCR | |||||||
TPCR(f) |
Cross Entropy | Decoupling | Co-Teaching | Co-Teaching+ | JoCoR | Co-Learning | TPCR | TPCR(f) |
Label | Methods | ||||||
y | ELR | 74.51 | 75.50 | 75.30 | 75.18 | 74.87 | 74.30 |
GJS | 78.36 | 79.30 | 79.68 | 79.60 | 79.68 | 79.42 | |
Co-learning | 81.42 | 82.01 | 81.53 | 80.88 | 80.09 | 79.07 | |
TPCR | 85.27 | 85.24 | 85.27 | 85.08 | 84.97 | 84.56 | |
ELR | 73.50 | 73.57 | 73.55 | 73.66 | 73.60 | 73.65 | |
GJS | 78.44 | 78.52 | 78.80 | 78.82 | 78.88 | 78.72 | |
Co-learning | 76.93 | 77.68 | 78.25 | 78.06 | 77.78 | 77.27 | |
TPCR | 84.52 | 84.55 | 84.76 | 84.63 | 84.45 | 84.18 | |
ELR | 32.97 | 42.72 | 69.00 | 72.75 | 73.42 | 73.86 | |
GJS | 30.20 | 42.10 | 71.84 | 76.94 | 78.82 | 79.32 | |
Co-learning | 32.57 | 41.97 | 71.53 | 76.57 | 78.43 | 78.61 | |
TPCR | 36.74 | 49.03 | 79.95 | 83.49 | 84.36 | 84.21 |
ELR | GJS | Co-Learning | TPCR |
h | h | h | h |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (
Sun, X.; Zhang, S.; Ma, S. Prediction Consistency Regularization for Learning with Noise Labels Based on Contrastive Clustering. Entropy 2024, 26, 308.