A Novel on Automatic K Value for Efficiency Improvement of K-means Clustering

Se-Hoon Jung, Kyoung-Jong Kim, Eun-Cheon Lim & Chun-Bo Sim

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 448)

Abstract

Advances in hardware and software have shortened the cycle in which new data are generated and have broadened the categories of data produced. Machine learning in particular has attracted explosive interest, as it is used to categorize and analyze data with artificial intelligence and to compete against humans. Once data have been generated, their value lies in how they are used, and using them requires analyzing past data and clustering new data. This study therefore investigated an algorithm that automatically determines the initial number of clusters, a known problem of the K-means algorithm used in data clustering. It also proposed an approach that optimizes the number of clusters by applying principal component analysis, as a pre-processing step, to the input data for clustering. The performance evaluation shows an accuracy of approximately 87.6%.
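
The abstract does not spell out how the number of clusters is selected, so the following is only a minimal sketch of the general idea (principal component analysis as pre-processing, then K-means with K chosen automatically), not the authors' algorithm. It assumes the silhouette score as a stand-in selection criterion, keeps only the first principal component, and runs on synthetic data; all function names (`pca_first_component`, `kmeans`, `silhouette`, `choose_k`) are hypothetical.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pca_first_component(points, iters=200):
    """Project points onto their first principal component (power iteration)."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)]
           for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return [[sum(c[j] * v[j] for j in range(d))] for c in centered]

def kmeans(points, k, iters=50):
    """Plain K-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda i: dist2(p, centers[i])) for p in points]
        new_centers = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            # Keep the old center if a cluster happens to become empty.
            new_centers.append([sum(col) / len(members) for col in zip(*members)]
                               if members else centers[i])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    total = 0.0
    for i, p in enumerate(points):
        sums, counts = {}, {}
        for j, q in enumerate(points):
            if i != j:
                sums[labels[j]] = sums.get(labels[j], 0.0) + dist2(p, q) ** 0.5
                counts[labels[j]] = counts.get(labels[j], 0) + 1
        own = labels[i]
        if counts.get(own, 0) == 0:
            continue
        a = sums[own] / counts[own]                      # mean intra-cluster distance
        b = min(sums[l] / counts[l] for l in sums if l != own)  # nearest other cluster
        total += (b - a) / max(a, b)
    return total / len(points)

def choose_k(points, k_max=6):
    """Reduce with PCA, then pick the K in 2..k_max with the best silhouette."""
    reduced = pca_first_component(points)
    scores = {k: silhouette(reduced, kmeans(reduced, k)) for k in range(2, k_max + 1)}
    return max(scores, key=scores.get), scores

# Synthetic data (an assumption for illustration): three separated 2-D blobs.
rng = random.Random(0)
blobs = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
data = [[cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)]
        for cx, cy in blobs for _ in range(30)]
best_k, scores = choose_k(data)
print(best_k, {k: round(s, 3) for k, s in scores.items()})
```

The silhouette score is used here only because it needs no ground-truth labels and is easy to compute on small data; any other internal validity index could be substituted in `choose_k` without changing the rest of the sketch.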

Acknowledgments

This research was supported by the 'Area Software Convergence Commercialization Program' through the Ministry of Science, ICT and Future Planning (S0417161012).

Author information

Authors and Affiliations

Department of Multimedia Engineering, GwangYang SW Convergence Institute, Sunchon National University, Suncheon, Korea
Se-Hoon Jung
Research and Development Team, GwangYang SW Convergence Institute, Suncheon, Korea
Kyoung-Jong Kim
Harvard Medical School, Boston, USA
Eun-Cheon Lim
School of Information Communication and Multimedia Engineering, Sunchon National University, Suncheon, Korea
Chun-Bo Sim

Corresponding author

Correspondence to Chun-Bo Sim.

Editor information

Editors and Affiliations

Computer Science and Engineering, Seoul National University of Science and Technology, Seoul, Korea (Republic of)
James J. (Jong Hyuk) Park
School of Computing and Information Sciences, Florida International University, Miami, Florida, USA
Shu-Ching Chen
Department of Information Systems and Cyber Security, The University of Texas at San Antonio, Adelaide, Australia
Kim-Kwang Raymond Choo

About this paper

Cite this paper

Jung, SH., Kim, KJ., Lim, EC., Sim, CB. (2017). A Novel on Automatic K Value for Efficiency Improvement of K-means Clustering. In: Park, J., Chen, SC., Raymond Choo, KK. (eds) Advanced Multimedia and Ubiquitous Engineering. FutureTech MUE 2017 2017. Lecture Notes in Electrical Engineering, vol 448. Springer, Singapore. https://doi.org/10.1007/978-981-10-5041-1_31

DOI: https://doi.org/10.1007/978-981-10-5041-1_31
Published: 14 May 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-5040-4
Online ISBN: 978-981-10-5041-1
eBook Packages: Engineering, Engineering (R0)
