Abstract
The inaccurate supervision caused by label noise remains a major challenge for regression modeling. Regularized noise-robust models offer a valid way to deal with label noise in regression tasks: they generally use robust losses to cope with the noise and further enhance model robustness through feature selection. However, most of them may not work well on data sets contaminated by severe noise (noise of extreme magnitude), because such noise violates their noise assumptions. To address this concern, this paper proposes a robust adaptive linear regression method named TC-ALASSO (Truncated Cauchy Adaptive LASSO), in which model learning and feature selection are performed simultaneously. The fat-tailed Cauchy distribution and truncation theory are adopted to handle moderate noise and identified extreme noise, respectively, yielding the truncated Cauchy loss for regression tasks. Moreover, TC-ALASSO applies an adaptive regularizer to carry out feature selection effectively; its adaptive regularizer weights are obtained from the regression coefficient estimates under the truncated Cauchy loss. We also theoretically analyze the robustness of the proposed TC-ALASSO. Experimental results on artificial and benchmark data sets confirm the robustness and effectiveness of TC-ALASSO. In addition, experimental results on face recognition databases validate the performance advantage of TC-ALASSO over state-of-the-art methods in dealing with extreme illumination variations.
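The loss construction described above, i.e., a Cauchy loss for moderate noise that is truncated (capped) once a residual is identified as extreme, can be illustrated with a minimal sketch. This assumes the common form of the truncated Cauchy loss, \(\ln(1+(r/\gamma)^2)\) capped at a truncation threshold \(\epsilon\); the parameter names `gamma` and `epsilon` are illustrative, and the paper's exact parameterization may differ.

```python
import numpy as np

def truncated_cauchy_loss(residuals, gamma=1.0, epsilon=3.0):
    """Cauchy loss capped for residuals whose magnitude exceeds epsilon."""
    r = np.asarray(residuals, dtype=float)
    cauchy = np.log1p((r / gamma) ** 2)      # fat-tailed Cauchy loss for moderate noise
    cap = np.log1p((epsilon / gamma) ** 2)   # constant loss: extreme noise cannot dominate
    return np.where(np.abs(r) <= epsilon, cauchy, cap)
```

Because the loss is constant beyond the truncation point, a sample with an arbitrarily large residual contributes no more to the objective than one at the threshold, which is what bounds the influence of severe noise.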
Notes
Unbiasedness: the resulting estimator should have low bias when estimating large regression coefficients \(\beta _j\).
Feature selection consistency: \(\underset{n\rightarrow \infty }{\text {lim}}\ P(\text {supp}(\tilde{\varvec{\beta }})=\text {supp}(\varvec{\beta }))= 1\), where \(\tilde{\varvec{\beta }}\) is the estimate of the regression coefficient vector \(\varvec{\beta }\).
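The two oracle-style properties above are what adaptive-LASSO weighting is designed to achieve. A minimal sketch, assuming the usual weighting \(w_j = 1/|\tilde{\beta }_j|^{\gamma }\) computed from an initial coefficient estimate (in TC-ALASSO, the estimate obtained under the truncated Cauchy loss); the names `power` and `eps` are illustrative, not the paper's notation:

```python
import numpy as np

def adaptive_weights(beta_init, power=1.0, eps=1e-8):
    """Adaptive-LASSO-style penalty weights from an initial estimate."""
    beta_init = np.asarray(beta_init, dtype=float)
    # Large initial coefficients get small penalties (low bias on strong
    # features); near-zero coefficients get large penalties and are shrunk out.
    return 1.0 / (np.abs(beta_init) + eps) ** power
```

This data-dependent penalty is how the adaptive regularizer can keep large coefficients nearly unbiased while still driving irrelevant ones to exactly zero.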
Acknowledgements
The authors sincerely thank Jian Yang of Nanjing University of Science and Technology for providing the database and source code of his paper, and Yudong Liang of Shanxi University for providing the FaceNet experimental results. This work was supported by the National Natural Science Foundation of China (Nos. U21A20513, 62076154, U1805263), the Central Government Guides Local Science and Technology Innovation Projects (Grant YDZX20201400001224), and the Key R&D Program of Shanxi Province (202202020101003).
Author information
Contributions
Yaqing Guo wrote the manuscript under the guidance of Professor Wenjian Wang. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Guo, Y., Wang, W. A robust adaptive linear regression method for severe noise. Knowl Inf Syst 65, 4613–4653 (2023). https://doi.org/10.1007/s10115-023-01924-4