Abstract
Although clustering with pairwise constraints through penalty regularization has been widely adopted in existing semi-supervised clustering approaches, little work has been done to compare these pairwise constrained approaches theoretically with respect to the penalties they use. In this paper, we first propose two types of penalties in the context of pairwise constrained fuzzy clustering. The first accounts for the overall consistency of assignments, in terms of fuzzy memberships, across constrained pairs. The second is the total Euclidean distance between the membership vectors of constrained pairs. Through analytical discussion, we establish the connection between the different penalties, which provides a unified view and a better understanding of each. Following the idea of penalty regularization, we formulate variants of pairwise constrained fuzzy c-means by incorporating the consistency-type and distance-type penalties, respectively, into the objective function of fuzzy c-means. We also extend this idea to co-clustering by considering pairwise constraints on two types of objects to produce fuzzy co-clusters. Efficient and scalable algorithms are proposed for parallel implementation. Experimental results on real-world datasets show good performance of the proposed approaches in terms of effectiveness, efficiency and scalability.
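To make the two penalty types concrete, the following is a minimal sketch and not the paper's formulation. It assumes a membership matrix `U` with one row per object, uses the inner product of two membership vectors as a stand-in for "consistency", and uses the squared Euclidean distance for the distance-type quantity. The function names `pair_agreement` and `pair_sq_distance` are hypothetical. Under these assumptions the two quantities are linked by the identity ||u_i - u_j||^2 = ||u_i||^2 + ||u_j||^2 - 2 u_i·u_j, which illustrates why a connection between the penalties can be expected.

```python
import numpy as np

def pair_agreement(U, pairs):
    """Sum of inner products u_i . u_j over the given pairs.

    Assumption: the inner product of two membership vectors is used as a
    proxy for how consistently the two objects are assigned.
    """
    return float(sum(U[i] @ U[j] for i, j in pairs))

def pair_sq_distance(U, pairs):
    """Sum of squared Euclidean distances between membership vectors.

    Assumption: the squared distance is used here so that the algebraic
    link to the inner product is exact.
    """
    return float(sum(np.sum((U[i] - U[j]) ** 2) for i, j in pairs))

# Toy fuzzy memberships for three objects and two clusters (rows sum to 1).
U = np.array([[0.9, 0.1],
              [0.8, 0.2],
              [0.1, 0.9]])
must_link = [(0, 1)]  # hypothetical constraint: objects 0 and 1 belong together

agreement = pair_agreement(U, must_link)
distance = pair_sq_distance(U, must_link)

# Identity check: squared distance = sum of squared norms - 2 * agreement.
norms = float(sum(U[i] @ U[i] + U[j] @ U[j] for i, j in must_link))
identity_holds = bool(np.isclose(distance, norms - 2.0 * agreement))
```

In this toy case the must-link pair has high agreement and a small distance, so either quantity could serve as the basis of a penalty; how each is weighted and combined with the fuzzy c-means objective is specified in the paper itself, not here.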
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 61502420), and the Zhejiang Provincial Natural Science Foundation of China (Grant No. LY16F020032).
Cite this article
Mei, JP., Lv, H., Cao, J. et al. Pairwise Constrained Fuzzy Clustering: Relation, Comparison and Parallelization. Int. J. Fuzzy Syst. 21, 1938–1949 (2019). https://doi.org/10.1007/s40815-019-00683-1
DOI: https://doi.org/10.1007/s40815-019-00683-1