
Efficient feature selection and classification algorithm based on PSO and rough sets

  • Original Article
  • Published in Neural Computing and Applications

Abstract

High-dimensional data are often characterized by many features but few instances, and many of those features are irrelevant or redundant. An extreme number of features inflates the memory required to represent the dataset, while a relatively small training set makes this irrelevancy and redundancy harder to detect. Hence, in this paper we propose an efficient feature selection and classification method based on Particle Swarm Optimization (PSO) and rough sets. We introduce an inconsistency handler algorithm for resolving inconsistency in the dataset, a new quick reduct algorithm for removing irrelevant/noisy features, and a fitness function with three parameters: the classification quality of the feature subset, the number of remaining features, and the accuracy of approximation. The proposed method is compared with two traditional feature selection methods and three existing fusions of PSO and rough sets. Decision Tree and Naive Bayes classifiers are used to measure the classification accuracy of the selected feature subsets on nine benchmark datasets. The results show that the proposed method automatically selects a small feature subset with better classification accuracy than using all features, and that it outperforms the two traditional and three existing PSO and rough set-based feature selection methods in terms of classification accuracy, feature cardinality, and stability indices. It is also observed that increasing the weight on the classification-quality term of the fitness function significantly reduces the cardinality of the selected features while also achieving better classification accuracy.
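The paper's exact algorithms are not reproduced on this page, but the general approach it builds on can be sketched: binary PSO searches over feature subsets, guided by a fitness that rewards the rough-set classification quality (the dependency degree, gamma) and penalizes subset size. The following is a minimal sketch under those assumptions; the inertia/acceleration coefficients, the weight `alpha`, and the two-term fitness (the paper uses three terms, including accuracy of approximation) are illustrative choices, not the authors' implementation.

```python
import math
import random

def dependency(data, labels, subset):
    """Rough-set classification quality (gamma): the fraction of instances
    whose values on the selected features determine the class uniquely,
    i.e. that fall in the positive region."""
    if not subset:
        return 0.0
    groups = {}
    for row, y in zip(data, labels):
        groups.setdefault(tuple(row[i] for i in subset), set()).add(y)
    pos = sum(1 for row in data
              if len(groups[tuple(row[i] for i in subset)]) == 1)
    return pos / len(data)

def fitness(bits, data, labels, alpha=0.8):
    """Weighted sum of classification quality and subset compactness.
    (A simplification: the paper's fitness has a third term, the
    accuracy of approximation.)"""
    subset = [i for i, b in enumerate(bits) if b]
    gamma = dependency(data, labels, subset)
    return alpha * gamma + (1 - alpha) * (1 - len(subset) / len(bits))

def bpso_select(data, labels, n_particles=10, iters=30, seed=1):
    """Binary PSO: velocities are real-valued; positions are bit vectors
    updated through a sigmoid transfer function."""
    rng = random.Random(seed)
    n = len(data[0])
    pos = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_particles)]
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [fitness(p, data, labels) for p in pos]
    gbest = pbest[max(range(n_particles), key=pbest_f.__getitem__)][:]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(n):
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                # sigmoid transfer: high velocity -> bit likely set to 1
                pos[i][d] = 1 if rng.random() < 1 / (1 + math.exp(-vel[i][d])) else 0
            f = fitness(pos[i], data, labels)
            if f > pbest_f[i]:
                pbest_f[i], pbest[i] = f, pos[i][:]
        gbest = pbest[max(range(n_particles), key=pbest_f.__getitem__)][:]
    return [i for i, b in enumerate(gbest) if b]
```

On a toy dataset where one feature fully determines the class, this sketch converges on a subset containing that feature: a larger `alpha` (more weight on classification quality) tends to shrink the selected subset, mirroring the abstract's observation.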





Author information


Corresponding author

Correspondence to Ramesh Kumar Huda.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


About this article


Cite this article

Huda, R.K., Banka, H. Efficient feature selection and classification algorithm based on PSO and rough sets. Neural Comput & Applic 31, 4287–4303 (2019). https://doi.org/10.1007/s00521-017-3317-9

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00521-017-3317-9
