Abstract
This paper analyses the data clustering problem from the point of view of continuous black-box optimization and proposes methodological guidelines for a standard benchmark of clustering problem instances. Clustering problems have often been used in the literature to evaluate evolutionary, metaheuristic and other global optimization algorithms. However, much of this work has been carried out independently, and the varied experimental methodologies have produced results that tend to be incomparable. As a result, they offer little collective insight into the difficulty of the problems used and no objective measure for comparing and evaluating algorithm performance. This paper surveys some of the clustering literature and results to identify issues relevant to benchmarking. A set of 27 problem instances, ranging from 4-D to 40-D and based on three well-known datasets, is identified. To establish pilot results on this benchmark set, experiments are presented for the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) and several other standard algorithms. A web repository has also been created for this problem set to facilitate better experimental evaluation of algorithms.
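To illustrate what treating clustering as a continuous black-box problem can look like, the following is a minimal sketch. It assumes the common sum-of-squares criterion, which the abstract does not name explicitly. The k candidate cluster centres for d-dimensional data are flattened into a single real vector of length k·d, and an optimizer only ever sees that vector and the returned objective value. The function name, the toy data and the candidate vector are hypothetical and are not taken from the paper or its repository.

```python
from typing import Sequence


def sum_of_squares_objective(
    x: Sequence[float], data: Sequence[Sequence[float]], k: int
) -> float:
    """Sum of squared Euclidean distances from each point to its nearest centre.

    x is a flat vector of length k * d holding k candidate centres, so the
    search-space dimension is k * d (assumed encoding, for illustration).
    """
    d = len(data[0])
    if len(x) != k * d:
        raise ValueError("x must have length k * d")
    # Unpack the flat vector into k centres of dimension d.
    centres = [x[i * d:(i + 1) * d] for i in range(k)]
    total = 0.0
    for point in data:
        # Each point contributes its squared distance to the closest centre.
        total += min(
            sum((p - c) ** 2 for p, c in zip(point, centre))
            for centre in centres
        )
    return total


# Hypothetical toy data: four 2-D points, two centres, so a 4-D search space.
toy_data = [[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [4.0, 5.0]]
candidate = [0.0, 0.5, 4.0, 4.5]
value = sum_of_squares_objective(candidate, toy_data, k=2)
```

Under this encoding, any continuous optimizer (CMA-ES, for example) could be applied by passing it a function of `x` alone, with the data and `k` fixed in advance.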
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Gallagher, M. (2014). Clustering Problems for More Useful Benchmarking of Optimization Algorithms. In: Dick, G., et al. Simulated Evolution and Learning. SEAL 2014. Lecture Notes in Computer Science, vol 8886. Springer, Cham. https://doi.org/10.1007/978-3-319-13563-2_12
DOI: https://doi.org/10.1007/978-3-319-13563-2_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13562-5
Online ISBN: 978-3-319-13563-2
eBook Packages: Computer Science (R0)