Abstract
Advances in hardware and software have shortened the cycle in which new data are generated and have diversified the categories of data produced. Machine learning in particular has attracted intense interest, as it classifies and analyzes data through artificial intelligence and has even competed against humans. Once data have been generated, their value depends on how they are utilized, and utilizing them requires analyzing past data and clustering new data. This study therefore investigates an algorithm for automatically determining the initial number of clusters, one of the known problems of the K-means algorithm used in data clustering. It also proposes an approach that optimizes the number of clusters by applying principal component analysis, as a pre-processing step, to the input data for clustering. The performance evaluation shows an accuracy of approximately 87.6%.
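As an illustration only, the following Python sketch shows one way a PCA-based pre-processing step could suggest K before K-means is run. The abstract does not state the rule the authors use, so the rule here (K is the number of principal components needed to reach a cumulative explained-variance threshold) is an assumption, as are the function names `estimate_k_by_pca` and `kmeans`, the `variance_threshold` parameter, and the use of NumPy.

```python
import numpy as np


def estimate_k_by_pca(X, variance_threshold=0.9):
    """Suggest K from PCA (ASSUMED rule, not confirmed by the paper).

    K is the number of leading principal components whose cumulative
    explained variance first reaches `variance_threshold`.
    """
    Xc = X - X.mean(axis=0)                      # center each feature
    cov = np.atleast_2d(np.cov(Xc, rowvar=False))
    eigvals = np.linalg.eigvalsh(cov)[::-1]      # eigenvalues, descending
    eigvals = np.clip(eigvals, 0.0, None)        # drop tiny negative round-off
    total = eigvals.sum()
    if total <= 0.0:                             # constant data: one cluster
        return 1
    cumulative = np.cumsum(eigvals) / total
    k = int(np.searchsorted(cumulative, variance_threshold)) + 1
    return min(k, len(eigvals))


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's K-means with random initial centers drawn from X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every center, shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


if __name__ == "__main__":
    # Synthetic data with two dominant directions of variance (illustrative).
    rng = np.random.default_rng(42)
    X = rng.normal(size=(300, 4)) * np.array([8.0, 4.0, 0.5, 0.1])
    k = estimate_k_by_pca(X)
    labels, centers = kmeans(X, k)
    print("suggested K:", k, "centers shape:", centers.shape)
```

Under this assumed rule, K can never exceed the number of input features, so the sketch is meant to show where the pre-processing step sits in the pipeline rather than to reproduce the reported accuracy.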
Acknowledgments
This research was supported by the ‘Area Software Convergence Commercialization Program’ through the Ministry of Science, ICT and Future Planning (S0417161012).
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jung, SH., Kim, KJ., Lim, EC., Sim, CB. (2017). A Novel on Automatic K Value for Efficiency Improvement of K-means Clustering. In: Park, J., Chen, SC., Raymond Choo, KK. (eds) Advanced Multimedia and Ubiquitous Engineering. FutureTech MUE 2017 2017. Lecture Notes in Electrical Engineering, vol 448. Springer, Singapore. https://doi.org/10.1007/978-981-10-5041-1_31
DOI: https://doi.org/10.1007/978-981-10-5041-1_31
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-5040-4
Online ISBN: 978-981-10-5041-1
eBook Packages: Engineering, Engineering (R0)