Abstract
The k-nearest neighbors (k-NN) classification technique is widely used owing to its simplicity, effectiveness, and robustness. As a lazy learner, k-NN is a versatile algorithm applied in many fields. In this classifier, the parameter k is usually chosen by the user, and its optimal value is found experimentally; the chosen constant is then used throughout the entire classification phase. Using the same k for every test sample, however, can degrade overall prediction performance: for more accurate predictions, the optimal k should be allowed to vary from one test sample to another. This study proposes a method that selects a k value dynamically for each instance by means of a simple clustering procedure. Experiments show that the improved classifier yields more accurate results, and the reasons for this success are analyzed and presented.
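The abstract states only that k is chosen per instance via "a simple clustering procedure"; the sketch below is one plausible reading of that idea, not the paper's exact algorithm. It clusters the training set with plain Lloyd's k-means, and for each query takes the size of the nearest cluster as that query's k before running an ordinary k-NN majority vote. All function names and the cluster-size rule for k are assumptions for illustration.

```python
import math
from collections import Counter

def kmeans(points, n_clusters, iters=25):
    """Plain Lloyd's k-means on a list of coordinate tuples."""
    centroids = list(points[:n_clusters])  # naive init: first points as seeds
    for _ in range(iters):
        clusters = [[] for _ in range(n_clusters)]
        for p in points:
            nearest = min(range(n_clusters),
                          key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep old centroid if a cluster empties out
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return centroids, clusters

def adaptive_knn_predict(train_x, train_y, query, centroids, clusters):
    """Pick k per query from the size of the query's nearest cluster
    (an assumed rule), then majority-vote over the k nearest samples."""
    nearest = min(range(len(centroids)),
                  key=lambda c: math.dist(query, centroids[c]))
    k = max(1, len(clusters[nearest]))  # per-instance k
    ranked = sorted(zip(train_x, train_y),
                    key=lambda xy: math.dist(query, xy[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

For example, with two well-separated blobs of three points each, a query near either blob is assigned k = 3 (its cluster's size) and is labeled by that blob's majority class.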
Bulut, F., Amasyali, M.F. Locally adaptive k parameter selection for nearest neighbor classifier: one nearest cluster. Pattern Anal Applic 20, 415–425 (2017). https://doi.org/10.1007/s10044-015-0504-0