Abstract
Multi-objective clustering partitions a given collection of objects into K groups according to a similarity/dissimilarity criterion while simultaneously optimizing several partition quality measures. This paper proposes an automated decomposition-based multi-objective clustering technique, SOMDEA_clust, which fuses a self-organizing map (SOM) with multi-objective differential evolution. A novel reproduction operator is designed in which an ensemble of multiple neighborhoods extracted using the SOM is used to construct a mating pool of variable size. The probabilities of selecting the different neighborhood sizes are updated according to how well each size has performed in generating improved solutions over the last few generations. A decomposition-based selection scheme is also used, which divides the multi-objective optimization (MOO) problem into a number of single-objective subproblems. The objective functions corresponding to these subproblems are optimized collaboratively using MOO. The potential of the proposed framework is shown by clustering four real-life data sets and five artificial data sets and comparing the results with those of existing multi-objective clustering techniques (MOCK, SMEA_clust, and MEA_clust), a single-objective genetic clustering technique (SOGA), and a traditional clustering technique (K-means). To show the utility of the SOM-based reproduction operators, another decomposition-based multi-objective clustering technique without SOM-based operators, MDEA_clust, is also developed in this paper. To show the efficacy of the proposed clustering technique in handling large data sets, two large-scale data sets with more than 5000 data points are also used.
As a real-life application, the proposed clustering technique is applied to scientific/web document clustering, where a set of scientific/web documents is partitioned based on content similarity. A semantic representation is used to convert each text document into a real-valued vector. Experimental results illustrate the effectiveness of fusing SOM and differential evolution (DE) in developing a clustering technique.
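The adaptive choice of neighborhood size and the decomposition into single-objective subproblems described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `floor` smoothing term, the candidate sizes, and the use of Tchebycheff scalarization are all assumptions made for illustration, since the abstract does not specify the update rule or the scalarizing function.

```python
import numpy as np


def update_size_probabilities(successes, trials, floor=0.05):
    """Turn recent per-size success counts into selection probabilities.

    successes[i] / trials[i] is the fraction of offspring created with
    neighborhood size i that improved on an existing solution during the
    last few generations. `floor` (an assumed smoothing term) keeps every
    size selectable even after a run of failures.
    """
    successes = np.asarray(successes, dtype=float)
    trials = np.asarray(trials, dtype=float)
    # Sizes that were never tried get a success rate of 0.
    rates = np.divide(successes, trials,
                      out=np.zeros_like(successes), where=trials > 0)
    scores = rates + floor
    return scores / scores.sum()


def tchebycheff(objectives, weights, ideal):
    """Scalarize one objective vector for a single subproblem.

    Assumed scalarization: the largest weighted distance to the ideal
    point, with all objectives treated as minimization objectives.
    """
    objectives = np.asarray(objectives, dtype=float)
    weights = np.asarray(weights, dtype=float)
    ideal = np.asarray(ideal, dtype=float)
    return float(np.max(weights * np.abs(objectives - ideal)))


# Hypothetical candidate mating-pool (neighborhood) sizes.
sizes = [5, 10, 15, 20]
probs = update_size_probabilities(successes=[2, 6, 1, 0],
                                  trials=[10, 10, 10, 0])

# Draw the neighborhood size used to build the next mating pool.
rng = np.random.default_rng(0)
chosen_size = int(rng.choice(sizes, p=probs))

# Value of one subproblem for a solution with two objective values.
value = tchebycheff([0.4, 0.9], weights=[0.5, 0.5], ideal=[0.1, 0.5])
```

In this sketch a neighborhood size that has recently produced more improved offspring is drawn more often, while the floor term prevents any size from being ruled out permanently.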
Acknowledgments
Dr. Sriparna Saha would like to acknowledge the support of SERB Women in Excellence Award-SB/WEA/08/2017 for carrying out this work.
Cite this article
Saini, N., Saha, S., Harsh, A. et al. Sophisticated SOM based genetic operators in multi-objective clustering framework. Appl Intell 49, 1803–1822 (2019). https://doi.org/10.1007/s10489-018-1350-8
DOI: https://doi.org/10.1007/s10489-018-1350-8