Abstract
We propose an unsupervised, model-agnostic, wrapper method for feature selection. We assume that if a feature can be predicted from the others, it adds little information to the problem and can therefore be removed without impairing the performance of whatever model is eventually built. The proposed method iteratively identifies and removes predictable, or nearly predictable, redundant features, allowing one to trade off complexity against expected quality. The approach relies neither on target labels nor on target values, and the model used to identify predictable features is unrelated to the final use of the feature set. It can therefore be applied to supervised, unsupervised, or semi-supervised problems, or even used as a safe pre-processing step to improve the results of other feature-selection techniques. Experimental comparisons against state-of-the-art feature-selection algorithms show satisfying performance on several non-trivial benchmarks.
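The elimination loop described in the abstract can be illustrated with a short sketch. The following Python fragment is a minimal, hypothetical rendering of the idea, not the authors' implementation: it assumes numeric features, uses scikit-learn's RandomForestRegressor with cross-validated R² as the predictability score, and introduces an illustrative stopping parameter `r2_threshold`; the paper's actual model, score, and stopping criterion may differ.

```python
# Minimal sketch of the predictable-features-elimination idea -- NOT the
# authors' implementation. Assumptions: features are numeric, predictability
# is measured as cross-validated R^2 of a random forest regressor, and
# `r2_threshold` is an illustrative stopping point.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

def predictable_features_elimination(X, r2_threshold=0.9):
    """Return the column indices kept after iterative elimination."""
    kept = list(range(X.shape[1]))
    while len(kept) > 1:
        scores = []
        for j in kept:
            others = [k for k in kept if k != j]
            # How well do the remaining features predict feature j?
            r2 = cross_val_score(
                RandomForestRegressor(n_estimators=50, random_state=0),
                X[:, others], X[:, j], cv=3, scoring="r2",
            ).mean()
            scores.append((r2, j))
        best_r2, best_j = max(scores)
        if best_r2 < r2_threshold:
            break  # no remaining feature is predictable enough: stop
        kept.remove(best_j)  # drop the most predictable (redundant) feature
    return kept

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=(200, 4))
    # Append a noisy near-copy of column 0: it should be eliminated.
    X = np.hstack([base, base[:, :1] + 0.01 * rng.normal(size=(200, 1))])
    print(predictable_features_elimination(X))
```

Because the loop never looks at labels, the same sketch applies unchanged to supervised and unsupervised pipelines; only the predictability model and the threshold are tuning choices.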
Cite this paper
Barbiero, P., Squillero, G., Tonda, A. (2022). Predictable Features Elimination: An Unsupervised Approach to Feature Selection. In: Nicosia, G., et al. (eds.) Machine Learning, Optimization, and Data Science. LOD 2021. Lecture Notes in Computer Science, vol. 13163. Springer, Cham. https://doi.org/10.1007/978-3-030-95467-3_29
Print ISBN: 978-3-030-95466-6. Online ISBN: 978-3-030-95467-3.