Abstract
K-nearest neighbors (kNN) is a popular machine learning algorithm thanks to its clarity, simplicity, and efficacy. Nevertheless, kNN has several drawbacks: it ignores class distribution, feature relevance, the varying contribution of each neighbor, and the number of instances per class. In particular, some features may matter more than others for classifying a data point, and increasing their weight in the distance computation can make kNN more accurate. Researchers have proposed various feature-weighting schemes, such as correlation-based feature selection, mutual information, and chi-square feature selection. This paper presents a new feature-weighting technique based on association rules and information gain. The proposed approach performs well compared to other similar methods.
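To make the idea of feature-weighted distances concrete, the sketch below implements only the information-gain half of the abstract's recipe: each feature's weight is its information gain with respect to the class labels, and those weights scale the Euclidean distance inside kNN. This is an illustrative approximation, not the authors' method — the association-rule component of the paper is not reproduced, and the function names and toy data are our own.

```python
import numpy as np
from collections import Counter

def entropy(labels):
    # Shannon entropy H(y) of an integer label array
    counts = np.bincount(labels)
    p = counts[counts > 0] / len(labels)
    return -np.sum(p * np.log2(p))

def information_gain(feature, labels):
    # IG(f) = H(y) - sum_v P(f=v) * H(y | f=v), for a discrete feature
    cond = 0.0
    for v in np.unique(feature):
        mask = feature == v
        cond += mask.mean() * entropy(labels[mask])
    return entropy(labels) - cond

def weighted_knn_predict(X_train, y_train, x, weights, k=3):
    # Euclidean distance with a per-feature weight on each squared difference
    d = np.sqrt(((X_train - x) ** 2 * weights).sum(axis=1))
    nearest = np.argsort(d)[:k]
    # majority vote among the k nearest neighbors
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Toy discrete dataset: feature 0 determines the class, feature 1 is noise.
X = np.array([[0, 1], [0, 0], [0, 1], [1, 0], [1, 1], [1, 0]])
y = np.array([0, 0, 0, 1, 1, 1])

w = np.array([information_gain(X[:, j], y) for j in range(X.shape[1])])
# The informative feature 0 gets weight 1.0; the noisy feature 1 gets a
# much smaller weight, so it barely influences the distance.
print(w)
print(weighted_knn_predict(X, y, np.array([1, 1]), w, k=3))
```

With these weights, the query point [1, 1] is classified by its feature-0 value alone, which is exactly the behavior feature weighting is meant to produce.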
Data availability
The datasets used during the current study are freely available in the UCI repository (Lichman, 2013) and Kaggle (www.kaggle.com/datasets).
Code Availability
The code is available upon reasonable request.
References
Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD international conference on Management of data, pp 207–216
Agrawal R, Srikant R et al (1994) Fast algorithms for mining association rules. In: Proc. 20th int. conf. very large data bases, VLDB, Santiago, Chile, pp 487–499
Aguilera J, González LC, Montes-y Gómez M, et al (2018) A new weighted k-nearest neighbor algorithm based on newton’s gravitational force. In: Iberoamerican Congress on Pattern Recognition, Springer, pp 305–313
Almomany A, Ayyad WR, Jarrah A (2022) Optimized implementation of an improved kNN classification algorithm using Intel FPGA platform: COVID-19 case study. J King Saud Univ Comput Inf Sci 34(6):3815–3827
AlSukker A, Khushaba R, Al-Ani A (2010) Optimizing the k-nn metric weights using differential evolution. In: 2010 International Conference on Multimedia Computing and Information Technology (MCIT), IEEE, pp 89–92
Altman NS (1992) An introduction to kernel and nearest-neighbor nonparametric regression. Am Stat 46(3):175–185
Asuncion A, Newman D (2007) UCI machine learning repository
Bhattacharya G, Ghosh K, Chowdhury AS (2017) Granger causality driven ahp for feature weighted knn. Pattern Recogn 66:425–436
Biswas N, Chakraborty S, Mullick SS et al (2018) A parameter independent fuzzy weighted k-nearest neighbor classifier. Pattern Recogn Lett 101:80–87
Chakravarthy SS, Bharanidharan N, Rajaguru H (2023) Deep learning-based metaheuristic weighted k-nearest neighbor algorithm for the severity classification of breast cancer. IRBM 44(3):100749
Chen Y, Hao Y (2017) A feature weighted support vector machine and k-nearest neighbor algorithm for stock market indices prediction. Expert Syst Appl 80:340–355
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297
Davis JV, Kulis B, Jain P, et al (2007) Information-theoretic metric learning. In: Proceedings of the 24th international conference on Machine learning, pp 209–216
Derrac J, García S, Herrera F (2014) Fuzzy nearest neighbor algorithms: taxonomy, experimental analysis and prospects. Inf Sci 260:98–119
Duda RO, Hart PE, Stork DG (2001) Pattern classification, 2nd edn. Wiley, Hoboken
Fahad LG, Tahir SF (2021) Activity recognition in a smart home using local feature weighting and variants of nearest-neighbors classifiers. J Ambient Intell Hum Comput 12:2355–2364
Friedman JH (2002) Stochastic gradient boosting. Comput Stat Data Anal 38(4):367–378
Ganaie M, Tanveer M, Initiative ADN et al (2022) Knn weighted reduced universum twin svm for class imbalance learning. Knowl Based Syst 245:108578
García-Laencina PJ, Sancho-Gómez JL, Figueiras-Vidal AR et al (2009) K nearest neighbours with mutual information for simultaneous classification and missing data imputation. Neurocomputing 72(7–9):1483–1493
Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. ACM Sigmod Record 29(2):1–12
Han J, Kamber M (2006) Data mining: concepts and techniques, 2nd edn. Morgan Kaufmann
Hssina B, Merbouha A, Ezzikouri H et al (2014) A comparative study of decision tree ID3 and C4.5. Int J Adv Comput Sci Appl 4(2):13–19
Huang GB, Zhou H, Ding X et al (2011) Extreme learning machine for regression and multiclass classification. IEEE Trans Syst Man Cybern Part B (Cybern) 42(2):513–529
Huang J, Wei Y, Yi J, et al (2018) An improved knn based on class contribution and feature weighting. In: 2018 10th international conference on measuring technology and mechatronics automation (ICMTMA), IEEE, pp 313–316
Jiao L, Geng X, Pan Q (2019) BPkNN: k-nearest neighbor classifier with pairwise distance metrics and belief function theory. IEEE Access 7:48935–48947
Karabulut B, Arslan G, Ünver HM (2019) A weighted similarity measure for k-nearest neighbors algorithm. Celal Bayar Univ J Sci 15(4):393–400
Kononenko I, Šimec E, Robnik-Šikonja M (1997) Overcoming the myopia of inductive learning algorithms with relieff. Appl Intell 7:39–55
Kuok CM, Fu A, Wong MH (1998) Mining fuzzy association rules in databases. ACM Sigmod Record 27(1):41–46
Li D, Gu M, Liu S et al (2022) Continual learning classification method with the weighted k-nearest neighbor rule for time-varying data space based on the artificial immune system. Knowl Based Syst 240:108145
Liu M, Vemuri BC (2012) A robust and efficient doubly regularized metric learning approach. In: Computer Vision–ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7–13, 2012, Proceedings, Part IV 12, Springer, pp 646–659
Lu S, Yue Y, Liu X et al (2022) A novel unbalanced weighted knn based on svm method for pipeline defect detection using eddy current measurements. Meas Sci Technol 34(1):014001
Mendel JM, John RB (2002) Type-2 fuzzy sets made simple. IEEE Trans Fuzzy Syst 10(2):117–127
Nagaraj P, Saiteja K, Ram KK, et al (2022) University recommender system based on student profile using feature weighted algorithm and knn. In: 2022 International Conference on Sustainable Computing and Data Communication Systems (ICSCDS), IEEE, pp 479–484
Rodríguez-Fdez I, Canosa A, Mucientes M, et al (2015) Stac: a web platform for the comparison of algorithms using statistical tests. In: 2015 IEEE international conference on fuzzy systems (FUZZ-IEEE), IEEE, pp 1–8
Scherf M, Brauer W (1997) Feature selection by means of a feature weighting approach. Citeseer
Su MY (2011) Real-time anomaly detection systems for denial-of-service attacks by weighted k-nearest-neighbor classifiers. Expert Syst Appl 38(4):3492–3498
Sun L, Zhang J, Ding W et al (2022) Feature reduction for imbalanced data classification using similarity-based feature clustering with adaptive weighted k-nearest neighbors. Inf Sci 593:591–613
Tang B, He H (2015) Enn: extended nearest neighbor method for pattern recognition [research frontier]. IEEE Comput Intell Mag 10(3):52–60
Tsang IW, Cheung PM, Kwok JT (2005) Kernel relevant component analysis for distance metric learning. In: Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005, IEEE, pp 954–959
Weinberger KQ, Saul LK (2009) Distance metric learning for large margin nearest neighbor classification. J Mach Learn Res 10(2)
Wilson DR, Martinez TR (1997) Improved heterogeneous distance functions. J Artif Intell Res 6:1–34
Witten IH, Frank E, Hall MA et al (2016) Data mining: practical machine learning tools and techniques, 4th edn. Morgan Kaufmann
Xie P, Xing E (2014) Large scale distributed distance metric learning. arXiv preprint arXiv:1412.5949
Yang W, Wang Z, Sun C (2015) A collaborative representation based projections method for feature extraction. Pattern Recogn 48(1):20–27
Yue G, Qu Y, Deng A et al (2023) Neuro-weighted multi-functional nearest-neighbour classification. Expert Syst 40(5):e13125
Zhang C, Liu C, Zhang X et al (2017) An up-to-date comparison of state-of-the-art classification algorithms. Expert Syst Appl 82:128–150
Zhang H, Wang Z, Xia W et al (2022) Weighted adaptive knn algorithm with historical information fusion for fingerprint positioning. IEEE Wirel Commun Lett 11(5):1002–1006
Zhang X, Xiao H, Gao R et al (2022) K-nearest neighbors rule combining prototype selection and local feature weighting for classification. Knowl Based Syst 243:108451
Author information
Contributions
The authors confirm their contribution to the paper as follows: Study conception and design: YM, MEf, KAB, Data collection: YM, YB, Analysis and interpretation of results: MEf, RF. Draft manuscript preparation: YM, KAB, YB. All authors reviewed the results and approved the final version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Consent to participate
All authors consent to participate.
Consent for publication
All authors consent to publication.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Manzali, Y., Barry, K.A., Flouchi, R. et al. A feature weighted K-nearest neighbor algorithm based on association rules. J Ambient Intell Human Comput 15, 2995–3008 (2024). https://doi.org/10.1007/s12652-024-04793-z