A local spline regression-based framework for semi-supervised sparse feature selection

Published: 28 February 2023

Abstract

Feature selection (FS) is widely used in machine learning to select the relevant features from data sets. Many applications offer abundant unlabeled data, which semi-supervised FS can exploit to compensate for scarce labeled data and improve learning performance. Recently, semi-supervised sparse FS based on the graph Laplacian, which uses the correlation between features during selection, has attracted considerable research interest. However, Laplacian regularization has weak extrapolating power, is biased towards the constant geodesic function, and does not retain the local topology well. In this paper, a spline regression-based framework for semi-supervised sparse FS (SRS3FS) is proposed. It uses mixed convex and non-convex ℓ2,p-norm (0 < p ≤ 1) regularization to select the relevant features while considering the correlation between features. The framework exploits local spline regression to retain the geometric structure of the labeled and unlabeled data and to encode the data distribution. A unified iterative algorithm is presented to solve the proposed framework in both the convex and non-convex cases; its convergence is proved theoretically and verified experimentally. Experiments on several data sets illustrate the effectiveness of the framework in selecting the most relevant and discriminative features.
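
To make the regularizer named in the abstract concrete, the following is a minimal sketch of the ℓ2,p (quasi-)norm of a projection matrix W and of the usual way row sparsity is turned into a feature ranking. It assumes the standard definition ‖W‖2,p = (Σ_i ‖w^i‖_2^p)^(1/p), where w^i is the i-th row of W, and it assumes that features are ranked by the ℓ2 norm of their rows, as is common in sparse FS. The abstract does not state the full objective, so this is not the SRS3FS algorithm; the function names `l2p_norm` and `rank_features` are hypothetical.

```python
import numpy as np


def l2p_norm(W, p):
    """Return (sum_i ||w_i||_2^p)^(1/p), where w_i is the i-th row of W.

    For p = 1 this is the convex l2,1-norm; for 0 < p < 1 it is a
    non-convex quasi-norm that penalizes non-zero rows more aggressively.
    """
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    row_norms = np.linalg.norm(W, axis=1)  # l2 norm of each row (one row per feature)
    return float(np.sum(row_norms ** p) ** (1.0 / p))


def rank_features(W):
    """Return feature indices sorted by decreasing row l2 norm (assumed ranking rule)."""
    row_norms = np.linalg.norm(W, axis=1)
    return np.argsort(-row_norms, kind="stable")
```

A row of W that is entirely zero contributes nothing to the norm and lands at the bottom of the ranking, which is how a row-sparse solution discards a feature.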

Published In

Knowledge-Based Systems, Volume 262, Issue C
Feb 2023
472 pages

Publisher

Elsevier Science Publishers B. V.
Netherlands

Author Tags

1. Semi-supervised feature selection
2. Spline regression
3. Sparse models
