Nothing Special   »   [go: up one dir, main page]

Skip to main content

Fuzzy Information Measures Feature Selection Using Descriptive Statistics Data

  • Conference paper
  • First Online:
Knowledge Science, Engineering and Management (KSEM 2022)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 13370))

  • 2068 Accesses

Abstract

Feature selection (FS) has proven its importance as a preprocessing for improving classification performance. The success of FS methods depends on extracting all the possible relations among features to estimate their informative amount well. Fuzzy information measures are powerful solutions that extract the different feature relations without information loss. However, estimating fuzzy information measures consumes high resources such as space and time. To reduce the high cost of these resources, this paper proposes a novel method to generate FS based on fuzzy information measures using descriptive statistics data (DS) instead of the original data (OD). The main assumption behind this is that the descriptive statistics of features can hold the same relations as the original features. Over 15 benchmark datasets, the effectiveness of using DS has been evaluated on five FS methods according to the classification performance and feature selection cost.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 99.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 129.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    https://archive.ics.uci.edu/ml/datasets.php.

  2. 2.

    https://github.com/klainfo/NASADefectDataset.

References

  1. Bolón-Canedo, V., Sánchez-Maroño, N., Alonso-Betanzos, A.: Feature selection for high-dimensional data. Progr. Artif. Intell. 5(2), 65–75 (2016). https://doi.org/10.1007/s13748-015-0080-y

    Article  Google Scholar 

  2. Bommert, A., Sun, X., Bischl, B., Rahnenführer, J., Lang, M.: Benchmark for filter methods for feature selection in high-dimensional classification data. Comput. Stat. Data Anal. 143, 106839 (2020)

    Google Scholar 

  3. Cateni, S., Colla, V., Vannucci, M.: A method for resampling imbalanced datasets in binary classification tasks for real-world problems. Neurocomputing 135, 32–41 (2014)

    Google Scholar 

  4. Cavallaro, M., Fidell, L.: Basic descriptive statistics: commonly encountered terms and examples. Am. J. EEG Technol. 34(3), 138–152 (1994)

    Google Scholar 

  5. Cheng, J., Greiner, R.: Comparing bayesian network classifiers. arXiv preprint arXiv:1301.6684 (2013)

  6. Guyon, I., Gunn, S., Nikravesh, M., Zadeh, L.A.: Feature Extraction: Foundations and Applications, vol. 207. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-35488-8

  7. Hu, Q., Xie, Z., Yu, D.: Hybrid attribute reduction based on a novel fuzzy-rough model and information granulation. Pattern Recogn. 40(12), 3509–3521 (2007)

    Google Scholar 

  8. Hu, Q., Yu, D., Xie, Z.: Information-preserving hybrid data reduction based on fuzzy-rough techniques. Pattern Recogn. Lett. 27(5), 414–423 (2006)

    Google Scholar 

  9. Kaur, P., Stoltzfus, J., Yellapu, V., et al.: Descriptive statistics. Int. J. Acad. Medi. 4(1), 60 (2018)

    Google Scholar 

  10. Laradji, I.H., Alshayeb, M., Ghouti, L.: Software defect prediction using ensemble learning on selected features. Inf. Softw. Technol. 58, 388–402 (2015)

    Google Scholar 

  11. Lohrmann, C., Luukka, P., Jablonska-Sabuka, M., Kauranne, T.: A combination of fuzzy similarity measures and fuzzy entropy measures for supervised feature selection. Exp. Syst. Appl. 110, 216–236 (2018)

    Google Scholar 

  12. Luukka, P.: Feature selection using fuzzy entropy measures with similarity classifier. Exp. Syst. Appl. 38(4), 4600–4607 (2011)

    Google Scholar 

  13. Patrick, E.A., Fischer, F.P.: A generalized k-nearest neighbor rule. Inf. Control 16(2), 128–152 (1970)

    Google Scholar 

  14. Raza, M.S., Qamar, U.: Feature selection using rough set-based direct dependency calculation by avoiding the positive region. Int. J. Approx. Reason. 92, 175–197 (2018)

    Google Scholar 

  15. Saeys, Y., Inza, I., Larrañaga, P.: A review of feature selection techniques in bioinformatics. bioinformatics 23(19), 2507–2517 (2007)

    Google Scholar 

  16. Salem, O.A., Liu, F., Chen, Y.P.P., Chen, X.: Ensemble fuzzy feature selection based on relevancy, redundancy, and dependency criteria. Entropy 22(7), 757 (2020)

    Google Scholar 

  17. Salem, O.A., Liu, F., Chen, Y.P.P., Chen, X.: Feature selection and threshold method based on fuzzy joint mutual information. Int. J. Approx. Reason. 132, 107–126 (2021)

    Google Scholar 

  18. Salem, O.A., Liu, F., Chen, Y.P.P., Hamed, A., Chen, X.: Fuzzy joint mutual information feature selection based on ideal vector. Exp. Syst. Appl. 193, 116453 (2022)

    Google Scholar 

  19. Shen, Z., Chen, X., Garibaldi, J.: Performance optimization of a fuzzy entropy based feature selection and classification framework. In: 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 1361–1367. IEEE (2018)

    Google Scholar 

  20. Tsai, Y.S., Yang, U.C., Chung, I.F., Huang, C.D.: A comparison of mutual and fuzzy-mutual information-based feature selection strategies. In: 2013 IEEE International Conference on, Fuzzy Systems (FUZZ), pp. 1–6. IEEE (2013)

    Google Scholar 

  21. Vergara, J.R., Estévez, P.A.: A review of feature selection methods based on mutual information. Neural Comput. Appl. 24(1), 175–186 (2013). https://doi.org/10.1007/s00521-013-1368-0

  22. Xue, B., Zhang, M., Browne, W.N., Yao, X.: A survey on evolutionary computation approaches to feature selection. IEEE Trans. Evol. Comput. 20(4), 606–626 (2015)

    Google Scholar 

  23. Yu, D., An, S., Hu, Q.: Fuzzy mutual information based min-redundancy and max-relevance heterogeneous feature selection. Int. J. Comput. Intell. Syst. 4(4), 619–633 (2011)

    Google Scholar 

Download references

Acknowledgement

This research has been supported by the National Natural Science Foundation of China (62172309).

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Haowen Liu , Feng Liu or Xi Chen .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Salem, O.A.M., Liu, H., Liu, F., Chen, YP.P., Chen, X. (2022). Fuzzy Information Measures Feature Selection Using Descriptive Statistics Data. In: Memmi, G., Yang, B., Kong, L., Zhang, T., Qiu, M. (eds) Knowledge Science, Engineering and Management. KSEM 2022. Lecture Notes in Computer Science(), vol 13370. Springer, Cham. https://doi.org/10.1007/978-3-031-10989-8_7

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-10989-8_7

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-10988-1

  • Online ISBN: 978-3-031-10989-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics