Abstract
Building a good prediction model from high-dimensional data is a difficult task in machine learning. Attribute subset selection is a crucial data preprocessing step that identifies the best group of attributes for improving the efficiency of supervised learning methods. Attribute reduction approaches have been used in many research fields concerned with intelligent systems and their ability to learn and to classify quickly, economically, and more accurately. Selecting a small attribute set is therefore a crucial step in building an intelligent system. Current research has shown that meaningful knowledge can be discovered by applying machine learning techniques to medical datasets for disease diagnosis. However, the high dimensionality of medical data poses major computational challenges in this field. If the training data contain inappropriate attributes, classifier accuracy may degrade and the outcomes may become less comprehensible. This is where attribute selection plays a crucial role. In this study, we applied a wrapper-based strategy with Particle Swarm Optimization (PSO), Genetic Algorithm, and Greedy Hill Climbing as attribute selection approaches for medical datasets. The efficiency of the PSO search method is compared with that of the Genetic search method and the Greedy search algorithm. We applied the proposed approach to three medical databases. The results showed that the Particle Swarm Optimization and Genetic Search methods can improve classification performance and that both outperformed the Greedy feature selection technique.
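To make the wrapper idea concrete, the following is a minimal sketch of greedy hill climbing (forward selection) in which each candidate attribute subset is scored by the accuracy of a classifier trained on it. It is an illustration under stated assumptions, not the chapter's implementation: the nearest-centroid classifier, the leave-one-out scoring, the synthetic data, and all function names are hypothetical choices made for this example. A PSO or genetic search would reuse the same scoring function and differ only in how candidate subsets are generated.

```python
import random


def nearest_centroid_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier (assumed
    stand-in for whatever classifier the wrapper would really use)."""
    if not features:
        return 0.0
    correct = 0
    for i in range(len(X)):
        # Build per-class centroids from every row except the held-out row i.
        sums, counts = {}, {}
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(features))
            for k, f in enumerate(features):
                acc[k] += row[f]
            counts[label] = counts.get(label, 0) + 1
        # Predict the class whose centroid is closest to the held-out row.
        best_label, best_dist = None, float("inf")
        for label, acc in sums.items():
            dist = sum(
                (X[i][f] - acc[k] / counts[label]) ** 2
                for k, f in enumerate(features)
            )
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y[i]
    return correct / len(X)


def greedy_hill_climbing(X, y, evaluate):
    """Forward greedy search: repeatedly add the single attribute that
    most improves the wrapper score; stop when no addition improves it."""
    n_features = len(X[0])
    selected, best_score = [], 0.0
    while True:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        scored = [(evaluate(X, y, selected + [f]), f) for f in candidates]
        # Ties are broken in favor of the lower attribute index.
        score, feature = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:
            break
        selected.append(feature)
        best_score = score
    return selected, best_score


def make_toy_data(n=60, seed=0):
    """Synthetic two-class data: attribute 0 is informative, 1-4 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        informative = label * 4.0 + rng.gauss(0, 1)
        noise = [rng.gauss(0, 1) for _ in range(4)]
        X.append([informative] + noise)
        y.append(label)
    return X, y


X, y = make_toy_data()
selected, score = greedy_hill_climbing(X, y, nearest_centroid_accuracy)
print("selected attributes:", selected)
print("leave-one-out accuracy:", round(score, 3))
```

Because the search only accepts an attribute when the score strictly improves, it can stall in a local optimum, which is one reason population-based searches such as PSO and genetic algorithms are often considered as alternatives.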
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kewat, A., Srivastava, P.N., Kumhar, D. (2020). Performance Evaluation of Wrapper-Based Feature Selection Techniques for Medical Datasets. In: Sharma, H., Govindan, K., Poonia, R., Kumar, S., El-Medany, W. (eds) Advances in Computing and Intelligent Systems. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-0222-4_60
DOI: https://doi.org/10.1007/978-981-15-0222-4_60
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0221-7
Online ISBN: 978-981-15-0222-4
eBook Packages: Intelligent Technologies and Robotics (R0)