
DOI: 10.1145/3146347.3146357
Research article · Public Access

Designing a Synchronization-reducing Clustering Method on Manycores: Some Issues and Improvements

Published: 12 November 2017

Abstract

The k-means clustering method is one of the most widely used techniques in big data analytics. In this paper, we explore the ideas of software blocking, asynchronous local optimizations, and simulated-annealing heuristics to improve the performance of k-means clustering. Like most machine learning methods, the performance of k-means clustering depends on two main factors: the computing speed (per iteration) and the convergence rate. A straightforward realization of the software-blocking, synchronization-reducing clustering algorithm, however, occasionally converges more slowly than the standard k-means algorithm. To address this issue, we design an annealing-enhanced algorithm that introduces heuristic stop conditions and annealing steps to deliver performance as good as or better than the standard k-means algorithm. The enhanced algorithm offers the same clustering quality as standard k-means. Experiments with real-world datasets show that the new parallel implementation is faster than the open-source HPC library of Parallel K-Means Data Clustering (e.g., 19% faster on relatively large datasets with 32 CPU cores, and 11% faster on a large dataset with 1,024 CPU cores). Moreover, the extent to which program performance improves is largely determined by the actual convergence rate of the algorithm on different datasets.
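To make the ideas in the abstract concrete, the sketch below shows ordinary Lloyd's k-means in NumPy with an annealing-style center perturbation and an inertia-based stop condition bolted on. This is an illustrative, serial sketch only, not the paper's parallel, software-blocked algorithm; the function name `kmeans_annealed` and the parameters `anneal_prob`, `temperature`, and `cooling` are hypothetical names chosen for this example.

```python
import numpy as np

def kmeans_annealed(points, k, max_iters=100, tol=1e-4,
                    anneal_prob=0.1, temperature=1.0, cooling=0.9, seed=0):
    """Lloyd's k-means with a simulated-annealing-style perturbation
    heuristic (illustrative sketch, not the paper's exact algorithm)."""
    rng = np.random.default_rng(seed)
    # Initialize centers by sampling k distinct input points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    prev_inertia = np.inf
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: recompute each non-empty cluster's center as its mean.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
        inertia = (dists[np.arange(len(points)), labels] ** 2).sum()
        # Annealing heuristic: occasionally perturb one center to escape a
        # poor local optimum; the perturbation shrinks as the system cools.
        if rng.random() < anneal_prob:
            j = rng.integers(k)
            centers[j] += rng.normal(scale=temperature, size=centers.shape[1])
        temperature *= cooling
        # Stop condition: total inertia is no longer improving meaningfully.
        if abs(prev_inertia - inertia) < tol * max(prev_inertia, 1.0):
            break
        prev_inertia = inertia
    return centers, labels
```

The perturbation plays the role the paper assigns to its annealing steps: it trades a temporary increase in inertia for a chance to reach a better optimum, while the cooling schedule and stop condition keep the extra iterations bounded.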


Cited By

  • (2020) Designing a parallel Feel-the-Way clustering algorithm on HPC systems. The International Journal of High Performance Computing Applications. DOI: 10.1177/1094342020975194. Online publication date: 28-Nov-2020.
  • (2019) A New Approach for Scheduling Job with the Heterogeneity-Aware Resource in HPC Systems. 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 1900-1907. DOI: 10.1109/HPCC/SmartCity/DSS.2019.00262. Online publication date: Aug-2019.

Published In

MLHPC'17: Proceedings of the Machine Learning on HPC Environments
November 2017
81 pages
ISBN:9781450351379
DOI:10.1145/3146347

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. High performance computing
  2. machine learning
  3. manycores
  4. simulated annealing
  5. synchronization-reducing clustering algorithms

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SC '17

Acceptance Rates

Overall Acceptance Rate 5 of 7 submissions, 71%
