research-article

Multitask fuzzy Bregman co-clustering approach for clustering data with multisource features

Published: 19 July 2017

Abstract

In typical real-world clustering problems, the features extracted from the data suffer from two problems that prevent accurate clustering. First, the features extracted from the samples carry little information useful for clustering. Second, the feature vector is usually high-dimensional and drawn from multiple sources, which produces a complex cluster structure in the feature space. In this paper, we propose combining multi-task clustering and fuzzy co-clustering techniques to overcome these two problems. In addition, the proposed algorithm uses the Bregman divergence as its measure of dissimilarity, which yields a general framework that can accommodate any Bregman distance function consistent with the data distribution and the structure of the clusters. The experimental results indicate that the proposed algorithm overcomes both problems and copes with the complexity and weakness of the features, resulting in appropriate clustering performance.
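
As a small illustration of why a Bregman divergence gives a general dissimilarity framework, the sketch below evaluates the standard definition D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y> for two common generator functions. It is not the algorithm proposed in the paper; the function names and the sample vectors are assumptions chosen only for this example.

```python
import math

def bregman_divergence(phi, grad_phi, x, y):
    """Standard definition: phi(x) - phi(y) - <grad phi(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner

# Generator 1 (illustrative): squared Euclidean norm.
# The resulting divergence is the squared Euclidean distance.
def sq_norm(v):
    return sum(vi * vi for vi in v)

def sq_norm_grad(v):
    return [2.0 * vi for vi in v]

# Generator 2 (illustrative): negative entropy, for strictly positive entries.
# The resulting divergence is the generalized Kullback-Leibler divergence.
def neg_entropy(v):
    return sum(vi * math.log(vi) for vi in v)

def neg_entropy_grad(v):
    return [math.log(vi) + 1.0 for vi in v]

# Hypothetical sample points; both are probability vectors.
x = [0.2, 0.8]
y = [0.5, 0.5]

d_euclid = bregman_divergence(sq_norm, sq_norm_grad, x, y)
d_kl = bregman_divergence(neg_entropy, neg_entropy_grad, x, y)

# Direct formulas, used only to cross-check the generic routine.
sq_dist_direct = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
kl_direct = sum(xi * math.log(xi / yi) for xi, yi in zip(x, y))
```

Swapping the generator changes the dissimilarity without touching the generic routine, which is the property a Bregman-based clustering framework can exploit to match the divergence to the data distribution.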



    Published In

    Neurocomputing, Volume 247, Issue C
    July 2017
    224 pages

    Publisher

    Elsevier Science Publishers B. V.

    Netherlands

    Author Tags

    1. Bregman divergence
    2. Fuzzy co-clustering
    3. Multitask clustering

    Qualifiers

    • Research-article

    Cited By

    • (2023) Hierarchical co-clustering with augmented matrices from external domains. Pattern Recognition, 142:C. DOI: 10.1016/j.patcog.2023.109657. Online publication date: 1-Oct-2023
    • (2021) A novel kernelized total Bregman divergence-based fuzzy clustering with local information for image segmentation. International Journal of Approximate Reasoning, 136:C (281-305). DOI: 10.1016/j.ijar.2021.06.004. Online publication date: 1-Sep-2021
    • (2020) Multitask possibilistic and fuzzy co-clustering algorithm for clustering data with multisource features. Neural Computing and Applications, 32:9 (4785-4804). DOI: 10.1007/s00521-018-3851-0. Online publication date: 1-May-2020
    • (2018) A novel machine learning framework for diagnosing the type 2 diabetics using temporal fuzzy ant miner decision tree classifier with temporal weighted genetic algorithm. Computing, 100:8 (759-772). DOI: 10.1007/s00607-018-0599-4. Online publication date: 1-Aug-2018
