Abstract
Clustering algorithms with attribute weighting have gained much attention during the last decade. However, they usually optimize a single objective function, which can limit their ability to cope with different kinds of data, especially those with non-hyperspherical shapes and/or linearly non-separable patterns. In this paper, a multiobjective optimization approach is introduced into the kernel-based attribute-weighted clustering algorithm: two objective functions, one for intracluster compactness and one for intercluster separation, are optimized simultaneously. In addition, a sampling operation and an efficient clustering ensemble method are combined with the projection similarity validity index approach to obtain the final clustering solution, which effectively reduces the computing time, especially for large data sets. Experiments on many data sets demonstrate that the proposed algorithm generally outperforms existing attribute-weighted algorithms and that the computing efficiency of selecting the final solution is improved by a large margin. Moreover, its merits as a tool for partitioning and cluster interpretation are shown.
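To make the two-objective idea concrete, the following is a minimal sketch and not the authors' formulation. It assumes a Gaussian kernel applied per attribute, non-negative attribute weights, a hard partition, and simple sum-based definitions of compactness and separation. The names `kernel_distance`, `objectives`, `dominates` and the parameter `sigma` are hypothetical and chosen only for illustration.

```python
import math


def kernel_distance(x, v, weights, sigma=1.0):
    """Attribute-weighted, kernel-induced squared distance between x and v.

    Assumption: a Gaussian kernel K is applied to each attribute separately,
    so the feature-space squared distance for attribute j is
    2 * (1 - K(x_j, v_j)), scaled by the attribute weight w_j.
    """
    total = 0.0
    for xj, vj, wj in zip(x, v, weights):
        k = math.exp(-((xj - vj) ** 2) / (2.0 * sigma ** 2))
        total += wj * 2.0 * (1.0 - k)
    return total


def objectives(data, labels, centers, weights, sigma=1.0):
    """Return (compactness, separation) for a hard partition.

    compactness: sum of distances from each point to its own center
                 (to be minimized).
    separation:  sum of distances between every pair of centers
                 (to be maximized).
    Both definitions are illustrative assumptions.
    """
    compactness = sum(
        kernel_distance(x, centers[c], weights, sigma)
        for x, c in zip(data, labels)
    )
    separation = sum(
        kernel_distance(centers[a], centers[b], weights, sigma)
        for a in range(len(centers))
        for b in range(a + 1, len(centers))
    )
    return compactness, separation


def dominates(f, g):
    """True if objective pair f Pareto-dominates g.

    f is no worse on both objectives (lower compactness, higher separation)
    and strictly better on at least one.
    """
    no_worse = f[0] <= g[0] and f[1] >= g[1]
    strictly_better = f[0] < g[0] or f[1] > g[1]
    return no_worse and strictly_better
```

A multiobjective optimizer would keep the candidate solutions that no other candidate dominates, rather than collapsing the two objectives into a single weighted sum. Selecting one final partition from that non-dominated set is a separate step, which the abstract addresses with sampling, a clustering ensemble, and a validity index.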
Acknowledgements
This study was funded by the Natural Science Foundation of China (Grant No. 61373126) and the Fundamental Research Funds for the Central Universities of China (Grant No. JUSRP51510).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Zhou, Z., Zhu, S. Kernel-based multiobjective clustering algorithm with automatic attribute weighting. Soft Comput 22, 3685–3709 (2018). https://doi.org/10.1007/s00500-017-2590-y
DOI: https://doi.org/10.1007/s00500-017-2590-y