Abstract
In machine learning, and in data processing in general, it is very important to select the proper number of features. If we select too few, we miss important information and do not get good results; if we select too many, we include many irrelevant features that only add noise and thus again worsen the results. The usual method of selecting the proper number of features is to add features one by one until the quality stops improving and starts deteriorating. This method works, but it often takes too much time. In this paper, we propose faster (indeed, asymptotically optimal) methods for solving this problem.
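To make the contrast concrete, here is a minimal sketch of the usual one-by-one method next to a ternary search. The ternary search is a standard technique for unimodal functions and is swapped in only for illustration; it is not necessarily the method proposed in the paper. The sketch assumes that quality, as a function of the number of features k, strictly increases up to a single peak and strictly decreases afterwards. The names `linear_scan`, `ternary_search`, `quality`, and `toy_quality` are hypothetical.

```python
from typing import Callable, Dict, Tuple


def linear_scan(quality: Callable[[int], float], n_max: int) -> Tuple[int, int]:
    """Baseline: add features one by one and stop at the first drop in quality.

    Returns (best number of features, number of quality evaluations).
    """
    best_k, best_q = 1, quality(1)
    evals = 1
    for k in range(2, n_max + 1):
        q = quality(k)
        evals += 1
        if q < best_q:  # quality started deteriorating
            break
        best_k, best_q = k, q
    return best_k, evals


def ternary_search(quality: Callable[[int], float], n_max: int) -> Tuple[int, int]:
    """Illustrative alternative (an assumption, not the paper's algorithm).

    Requires quality to be strictly unimodal on 1..n_max.
    Returns (best number of features, number of distinct quality evaluations).
    """
    cache: Dict[int, float] = {}

    def q(k: int) -> float:
        # Each quality evaluation may mean retraining a model, so cache it.
        if k not in cache:
            cache[k] = quality(k)
        return cache[k]

    lo, hi = 1, n_max
    while hi - lo > 2:
        third = (hi - lo) // 3
        m1, m2 = lo + third, hi - third
        if q(m1) < q(m2):
            lo = m1 + 1  # the peak lies strictly to the right of m1
        else:
            hi = m2 - 1  # the peak lies strictly to the left of m2
    best_k = max(range(lo, hi + 1), key=q)
    return best_k, len(cache)


def toy_quality(k: int) -> float:
    """Hypothetical quality curve with a single peak at k = 600."""
    return -float((k - 600) ** 2)


if __name__ == "__main__":
    print(linear_scan(toy_quality, 1000))
    print(ternary_search(toy_quality, 1000))
```

On this toy curve both functions return the same number of features, but the one-by-one scan needs a number of evaluations proportional to the position of the peak, while the ternary search needs a number proportional to the logarithm of `n_max`. With real, noisy quality estimates the unimodality assumption may fail, in which case neither stopping rule is guaranteed to find the true optimum.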
This work was supported in part by the National Science Foundation grants 1623190 (A Model of Change for Preparing a New Generation for Professional Practice in Computer Science), HRD-1834620 and HRD-2034030 (CAHSI Includes), EAR-2225395, and by the AT&T Fellowship in Information Technology.
It was also supported by the program of the development of the Scientific-Educational Mathematical Center of Volga Federal District No. 075-02-2020-1478, and by a grant from the Hungarian National Research, Development and Innovation Office (NRDI).
Acknowledgements
The authors are grateful to the anonymous reviewers for their valuable suggestions.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tizpaz-Niari, S., Longpré, L., Kosheleva, O., Kreinovich, V. (2023). Fast – Asymptotically Optimal – Methods for Determining the Optimal Number of Features. In: Huynh, VN., Le, B., Honda, K., Inuiguchi, M., Kohda, Y. (eds) Integrated Uncertainty in Knowledge Modelling and Decision Making. IUKM 2023. Lecture Notes in Computer Science, vol 14375. Springer, Cham. https://doi.org/10.1007/978-3-031-46775-2_11
DOI: https://doi.org/10.1007/978-3-031-46775-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46774-5
Online ISBN: 978-3-031-46775-2
eBook Packages: Computer Science, Computer Science (R0)