
DOI: 10.1145/3146347.3146357
Research article · Public Access

Designing a Synchronization-reducing Clustering Method on Manycores: Some Issues and Improvements

Published: 12 November 2017

Abstract

The k-means clustering method is one of the most widely used techniques in big data analytics. In this paper, we explore the ideas of software blocking, asynchronous local optimizations, and simulated-annealing heuristics to improve the performance of k-means clustering. Like most machine learning methods, the performance of k-means clustering depends on two main factors: the computing speed (per iteration) and the convergence rate. A straightforward realization of the software-blocking, synchronization-reducing clustering algorithm, however, occasionally converges more slowly than the standard k-means algorithm. To address this issue, we design an annealing-enhanced algorithm that introduces heuristic stop conditions and annealing steps to deliver performance as good as or better than the standard k-means algorithm. The enhanced algorithm offers the same clustering quality as standard k-means. Experiments with real-world datasets show that the new parallel implementation is faster than the open-source HPC library of Parallel K-Means Data Clustering (e.g., 19% faster on relatively large datasets with 32 CPU cores, and 11% faster on a large dataset with 1,024 CPU cores). Moreover, the extent to which program performance improves is largely determined by the actual convergence rate of the algorithm on different datasets.
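To make the ideas in the abstract concrete, the sketch below shows ordinary Lloyd's k-means in NumPy with an annealing-style center perturbation and an inertia-based stop condition bolted on. This is an illustrative, serial sketch only, not the paper's parallel, software-blocked algorithm; the function name `kmeans_annealed` and the parameters `anneal_prob`, `temperature`, and `cooling` are hypothetical names chosen for this example.

```python
import numpy as np

def kmeans_annealed(points, k, max_iters=100, tol=1e-4,
                    anneal_prob=0.1, temperature=1.0, cooling=0.9, seed=0):
    """Lloyd's k-means with a simulated-annealing-style perturbation
    heuristic (illustrative sketch, not the paper's exact algorithm)."""
    rng = np.random.default_rng(seed)
    # Initialize centers by sampling k distinct input points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    prev_inertia = np.inf
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: recompute each non-empty cluster's center as its mean.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
        inertia = (dists[np.arange(len(points)), labels] ** 2).sum()
        # Annealing heuristic: occasionally perturb one center to escape a
        # poor local optimum; the perturbation shrinks as the system cools.
        if rng.random() < anneal_prob:
            j = rng.integers(k)
            centers[j] += rng.normal(scale=temperature, size=centers.shape[1])
        temperature *= cooling
        # Stop condition: total inertia is no longer improving meaningfully.
        if abs(prev_inertia - inertia) < tol * max(prev_inertia, 1.0):
            break
        prev_inertia = inertia
    return centers, labels
```

The perturbation plays the role the paper assigns to its annealing steps: it trades a temporary increase in inertia for a chance to reach a better optimum, while the cooling schedule and stop condition keep the extra iterations bounded.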


Cited By

  • (2020) Designing a parallel Feel-the-Way clustering algorithm on HPC systems. The International Journal of High Performance Computing Applications. DOI: 10.1177/1094342020975194. Online publication date: 28-Nov-2020.
  • (2019) A New Approach for Scheduling Job with the Heterogeneity-Aware Resource in HPC Systems. 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 1900-1907. DOI: 10.1109/HPCC/SmartCity/DSS.2019.00262. Online publication date: Aug-2019.

Published In

MLHPC'17: Proceedings of the Machine Learning on HPC Environments
November 2017
81 pages
ISBN:9781450351379
DOI:10.1145/3146347

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. High performance computing
  2. machine learning
  3. manycores
  4. simulated annealing
  5. synchronization-reducing clustering algorithms

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SC '17

Acceptance Rates

Overall Acceptance Rate 5 of 7 submissions, 71%
