Improving medical diagnosis performance using hybrid feature selection via relieff and entropy based genetic search (RF-EGA) approach: application to breast cancer prediction

Cluster Computing

Abstract

This research proposes a new hybrid prediction algorithm for breast cancer, developed on a breast cancer dataset. Many approaches are available for diagnosing medical diseases, such as the genetic algorithm, ant colony optimization, particle swarm optimization, and the cuckoo search algorithm. The proposed algorithm combines ReliefF attribute reduction with an entropy-based genetic algorithm for breast cancer detection. This hybrid combination is used to handle datasets with high dimensionality and uncertainty. The data are obtained from the Wisconsin breast cancer dataset and have been categorized based on different properties. The performance of the proposed method is evaluated, and the results are compared with other well-known feature selection methods. The results show that the proposed method has a remarkable ability to generate a reduced-size subset of salient features while yielding significant classification accuracy for large datasets.
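The abstract names the two stages (a ReliefF filter followed by an entropy-based genetic search) but does not give their details, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a two-class problem, a simplified ReliefF with k nearest hits and misses, median binarization of the surviving features, and a GA fitness equal to the negative conditional entropy of the label minus a size penalty. All function names, parameters, and the synthetic toy data are hypothetical.

```python
import math
import random
import statistics


def relief_weights(X, y, k=1):
    """ReliefF-style feature weights for a two-class problem (k nearest hits/misses)."""
    n, d = len(X), len(X[0])
    # Scale per-feature differences to [0, 1]; guard against constant features.
    span = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    w = [0.0] * d
    for i in range(n):
        others = sorted((m for m in range(n) if m != i), key=lambda m: dist(X[i], X[m]))
        hits = [m for m in others if y[m] == y[i]][:k]
        misses = [m for m in others if y[m] != y[i]][:k]
        for j in range(d):
            # A relevant feature differs on misses and agrees on hits.
            if hits:
                w[j] -= sum(diff(X[i], X[m], j) for m in hits) / (len(hits) * n)
            if misses:
                w[j] += sum(diff(X[i], X[m], j) for m in misses) / (len(misses) * n)
    return w


def conditional_entropy(X_bin, y, mask):
    """H(y | selected binary features), in bits."""
    groups = {}
    for row, label in zip(X_bin, y):
        key = tuple(v for v, keep in zip(row, mask) if keep)
        groups.setdefault(key, []).append(label)
    h = 0.0
    for labels in groups.values():
        for c in set(labels):
            p = labels.count(c) / len(labels)
            h -= (len(labels) / len(y)) * p * math.log2(p)
    return h


def genetic_search(X_bin, y, rng, pop_size=20, generations=30, penalty=0.02, p_mut=0.1):
    """Tiny elitist GA over bit masks; the fitness is an assumed entropy-plus-size criterion."""
    d = len(X_bin[0])

    def fitness(mask):
        return -conditional_entropy(X_bin, y, mask) - penalty * sum(mask)

    pop = [[rng.randint(0, 1) for _ in range(d)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # keep the better half (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, d)  # one-point crossover
            # Flip each bit with probability p_mut (bit-flip mutation).
            children.append([bit ^ (rng.random() < p_mut) for bit in a[:cut] + b[cut:]])
        pop = parents + children
    return max(pop, key=fitness)


def rf_ega_sketch(X, y, keep=4, seed=0):
    """Filter with ReliefF-style weights, then search the surviving features with the GA."""
    rng = random.Random(seed)
    w = relief_weights(X, y)
    top = sorted(range(len(w)), key=lambda j: w[j], reverse=True)[:keep]
    medians = {j: statistics.median(row[j] for row in X) for j in top}
    X_bin = [[int(row[j] > medians[j]) for j in top] for row in X]
    mask = genetic_search(X_bin, y, rng)
    return [j for j, bit in zip(top, mask) if bit]


# Synthetic toy data (not the Wisconsin dataset): feature 0 tracks the label, the rest is noise.
rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[label + rng.uniform(-0.1, 0.1)] + [rng.random() for _ in range(5)] for label in y]
print("selected feature indices:", rf_ega_sketch(X, y))
```

The filter stage shrinks the search space before the genetic search runs, which is the usual motivation for a filter-plus-wrapper hybrid. The size penalty is there because, without it, conditional entropy can be driven toward zero simply by selecting more features.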

Author information

Corresponding author

Correspondence to Ilangovan Sangaiah.

About this article

Cite this article

Sangaiah, I., Vincent Antony Kumar, A. Improving medical diagnosis performance using hybrid feature selection via relieff and entropy based genetic search (RF-EGA) approach: application to breast cancer prediction. Cluster Comput 22 (Suppl 3), 6899–6906 (2019). https://doi.org/10.1007/s10586-018-1702-5

  • DOI: https://doi.org/10.1007/s10586-018-1702-5
