Abstract
Machine learning methods are widely used in control and information systems, and robust learning is an important issue in the machine learning field. In this work, we propose a novel robust regression framework. Specifically, we introduce a robust similarity measure induced by correntropy and theoretically establish its key properties, including symmetry, boundedness, nonnegativity, consistency, smoothness and approximation behavior. Moreover, the proposed robust metric recovers traditional metrics such as the \(l_{2}\)-norm and \(l_{0}\)-norm as the kernel parameter approaches different limits. Combining the proposed similarity metric with the \(\epsilon \)-insensitive pinball loss, we then develop a new robust twin support vector regression framework (RTSVR) for robust regression problems. The linear RTSVR model is built first, and a kernelized RTSVR version is developed to handle nonlinear regression. To deal with the nonconvexity of the proposed RTSVR, we solve it iteratively with a DC (difference of convex functions) programming algorithm, and the resulting algorithm converges linearly. To evaluate the proposed RTSVR, numerical experiments are carried out on two databases: a public benchmark database and a practical application database. Experiments on the benchmark data with different types of noise show that the proposed methods outperform the traditional methods in most cases. On the application database, the proposed RTSVR is combined with the near-infrared (NIR) spectral technique to analyze the hardness of licorice seeds in the low-frequency, intermediate-frequency and high-frequency spectral regions, respectively. Experiments on the different spectral regions show that the RTSVR performs better than the traditional methods in all spectral regions.
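To make the limiting behavior of the correntropy-induced metric concrete, the short Python sketch below uses the Gaussian-kernel correntropy-induced loss in its commonly used Welsch-type form \(\sigma^{2}(1-\exp(-e^{2}/(2\sigma^{2})))\); this form and the function names are assumptions for illustration and need not coincide exactly with the metric defined in the paper. For a large kernel parameter \(\sigma\) the loss behaves like a scaled \(l_{2}\)-norm, while for a small \(\sigma\) it saturates and acts like an \(l_{0}\)-type penalty, which is what suppresses the influence of large outliers.

```python
import numpy as np

def correntropy_loss(e, sigma):
    """Gaussian-kernel correntropy-induced loss (Welsch-type form, assumed here)."""
    return sigma**2 * (1.0 - np.exp(-e**2 / (2.0 * sigma**2)))

e = np.linspace(-3.0, 3.0, 7)

# Large sigma: sigma^2 * (1 - exp(-e^2 / (2 sigma^2))) ~ e^2 / 2 when sigma >> |e|,
# so the loss recovers the quadratic (l2-type) behavior.
print(correntropy_loss(e, sigma=100.0))
print(e**2 / 2.0)

# Small sigma: the loss saturates near sigma^2 for every nonzero error,
# behaving like a scaled l0-type indicator and bounding each outlier's influence.
print(correntropy_loss(e, sigma=0.05))
```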
References
Ghosh A, Senthilrajan A (2023) Comparison of machine learning techniques for spam detection. Multimed Tools Appl 82:29227–29254
Wang Y, Hong H, Xin H, Zhai R (2023) A two-stage unsupervised sentiment analysis method. Multimed Tools Appl 82:26527–26544
Jayadeva, Khemchandani R, Chandra S (2007) Twin support vector machines for pattern classification. IEEE Trans Pattern Anal Mach Intell 29(5):905–910
Yuan C, Yang LM (2021) Correntropy-based metric for robust twin support vector machine. Inf Sci 545:82–101
Vapnik V (1995) The nature of statistical learning theory. Springer, New York
Hazarika B, Gupta D (2022) Density-weighted support vector machines for binary class imbalance learning. Neural Comput Appl 54:1091–1130
Peng X (2010) TSVR: An efficient twin support vector machine for regression. Neural Netw 23(3):365–372
Singla M et al (2020) Robust twin support vector regression based on rescaled hinge loss. Pattern Recognit 105:107395
Qi Z, Tian Y, Shi Y (2013) Robust twin support vector machine for pattern classification. Pattern Recognit 46(1):305–316
Peng X, Xu D, Kong L, Chan D (2016) \(l_1\)-norm loss based twin support vector machine for data recognition. Inf Sci, pp 86–103
Bamakan SMH, Wang H, Shi Y (2017) Ramp loss K-support vector classification-regression; a robust and sparse multi-class approach to the intrusion detection problem. Knowl-Based Syst 126:113–126
Yang L, Dong H (2018) Support vector machine with truncated pinball loss and its application in pattern recognition. Chemo Intell Lab Syst 177:88–99
Yuan C, Yang LM (2021) Capped \(L_{2, p}\)-norm metric based robust least squares twin support vector machine for pattern classification. Neural Netw 142:457–478
Liu D, Shi Y, Tian Y (2015) Ramp loss nonparallel support vector machine for pattern classification. Knowl-Based Syst 85:224–233
Ren Q, Yang L (2022) A robust projection twin support vector machine with a generalized correntropy-based loss. Appl Intell. https://doi.org/10.1007/s10489-021-02480-6
Balasundaram S, Yogendra M (2018) Robust support vector regression in primal with asymmetric huber loss. Neural Process Lett 3:1–33
Lopez J, Maldonado S (2018) Robust twin support vector regression via second-order cone programming. Knowl-Based Syst 152:83–93
Balasundaram S, Prasad SC (2020) Robust twin support vector regression based on Huber loss function. Neural Comput Appl 32(15):11285–11309
Gupta D, Gupta U (2021) On robust asymmetric Lagrangian v-twin support vector regression using pinball loss function. Appl Soft Comput 102:107099
He Y, Qi Y, Ye Q, Yu D (2022) Robust least squares twin support vector regression with adaptive FOA and PSO for short-term traffic flow prediction. IEEE Trans Intell Transp Syst 23(9):14542–14556
Liu W, Pokharel P, Principe J (2007) Correntropy: properties and applications in non-Gaussian signal processing. IEEE Trans Signal Process 55(11):5286–5298
Xu G, Cao Z, Hu BG et al (2016) Robust support vector machines based on the rescaled hinge loss function. Pattern Recognit 63:139–148
Ren ZH, Yang LM (2018) Correntropy-based robust extreme learning machine for classification. Neurocomput 313:74–84
Yang LM, Dong H (2019) Robust support vector machine with generalized quantile loss for classification and regression. Appl Soft Comput J 81:105483
Singh A, Pokharel R, Principe J (2014) The C-loss function for pattern classification. Pattern Recognit 47(1):441–453
Le Thi HA, Dinh TP, Le HM, Vo XT (2015) DC approximation approaches for sparse optimization. Eur J Oper Res 244(1):26–46
Yang LM, Zhang SY (2016) A sparse extreme learning machine framework by continuous optimization algorithms and its application in pattern recognition. Eng Appl Artif Intell 53:176–189
Yang L, Sun Q (2016) Comparison of chemometric approaches for near-infrared spectroscopic data. Anal Methods 8(8):1914–1923
Xiang DH, Hu T, Zhou DX (2012) Approximation analysis of learning algorithms for support vector regression and quantile regression. J Appl Math
Liu W, Pokharel PP, Principe JC (2007) Correntropy: properties and applications in non-Gaussian signal processing. IEEE Trans Signal Process 55(11):5286–5298
Suykens JAK (2002) Least squares support vector machines. Int J Circ Theor Appl 27(6):605–615
Zhao YP, Zhao J, Min Z (2013) Twin least squares support vector regression. Neurocomput 118:225–236
Anagha P, Balasundaram S, Meena Y (2018) On robust twin support vector regression in primal using squared pinball loss. J Intell Fuzzy Syst 35(5):5231–5239
Blake C, Merz C (1998) UCI Repository for machine learning databases. https://archive.ics.uci.edu/ml/index.php
Yang L, Ren Z, Wang Y, Dong H (2017) A robust regression framework with laplace kernel-induced loss. Neural Comput 29(11):3014–3039
Wilcoxon F (1945) Individual comparisons by ranking methods. Biom Bull 1(6):80–83
Randles RH (2006) Wilcoxon signed rank test. John Wiley & Sons Inc
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Nos. 11471010 and 11271367). Moreover, the authors thank the referees for their constructive comments, which helped improve the paper.
Ethics declarations
All the authors declare that they do not have any conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: DC programming and DCA
We outline the main algorithmic results for DC programming. The key to DC programs is to decompose an objective function into the difference of two convex functions; a sequence of convex approximations of the objective then yields a sequence of solutions converging to a stationary point, possibly an optimal solution. Generally speaking, a so-called DC program minimizes a DC function

$$(P_{dc}): \quad \min \{f(x)=g(x)-h(x): x\in R^{n}\},$$

with \(g(x)\) and \(h(x)\) being convex functions.
The DCA is an iterative algorithm based on local optimality conditions and duality [14-17]. The main scheme of DCA is as follows (for simplicity, we omit the dual part here): at each iteration, one replaces in the primal DC problem (\(P_{dc}\)) the second component \(h\) by its affine minorization \(h(x^{k})+(x-x^{k})^{T}y^{k}\) to generate the convex program

$$x^{k+1}\in \arg \min \{g(x)-[h(x^{k})+(x-x^{k})^{T}y^{k}]: x\in R^{n}\}, \qquad (54)$$

where \(\partial h\) is the subdifferential of the convex function \(h\) and \(y^{k}\in \partial h(x^{k})\). In practice, a simplified form of the DCA is used: two sequences \(\{x^{k}\}\) and \(\{y^{k}\}\) satisfying \(y^{k}\in \partial h(x^{k})\) are constructed, and \(x^{k+1}\) is a solution to the convex program (54). The simplified DCA is described as follows. Initialization: Choose an initial point \(x^{0}\in R^{n}\) and let \(k=0\)
Repeat
Calculate \(y^{k}\in \partial h(x^k)\)
Solve convex program (54) to obtain \(x^{k+1} \)
Let k:=k+1
Until some stopping criterion is satisfied.
DCA is a descent method without line search, and it converges linearly for general DC programming.
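As an illustration of the simplified DCA loop above (not the paper's RTSVR solver), the following minimal Python sketch applies it to a toy one-dimensional DC program with \(g(x)=x^{2}\) and \(h(x)=|x|\), for which the convex subproblem (54) has a closed-form solution; the toy objective and the function names are assumptions chosen only to make the iteration concrete.

```python
import numpy as np

def dca(x0, max_iter=50, tol=1e-8):
    """Simplified DCA for the toy DC program f(x) = g(x) - h(x),
    with g(x) = x**2 and h(x) = |x| (both convex)."""
    x = x0
    for _ in range(max_iter):
        # Step 1: pick a subgradient y^k in the subdifferential of h at x^k
        # (d|x| = {sign(x)} for x != 0, and any value in [-1, 1] at x = 0).
        y = np.sign(x)
        # Step 2: solve the convex program (54): minimize g(x) - y^k * x.
        # Here the optimality condition 2x - y = 0 gives x^{k+1} = y / 2.
        x_new = y / 2.0
        # Stopping criterion: the iterates no longer change.
        if abs(x_new - x) < tol:
            break
        x = x_new
    return x

print(dca(3.0))    # converges to 0.5, a stationary point of f(x) = x**2 - |x|
print(dca(-0.7))   # converges to -0.5
```

Each iteration only requires one subgradient evaluation and one convex minimization, which is why no line search is needed and why, for the RTSVR objective, each DCA step reduces to a convex quadratic program.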
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, M., Zhao, Y. & Yang, L. Robust twin support vector regression with correntropy-based metric. Multimed Tools Appl 83, 45443–45469 (2024). https://doi.org/10.1007/s11042-023-17315-4