Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Improved Approximation Algorithms for Bipartite Correlation Clustering

Published: 01 January 2012 Publication History

Abstract

In this work we study the problem of bipartite correlation clustering (BCC), a natural bipartite counterpart of the well-studied correlation clustering (CC) problem [N. Bansal, A. Blum, and S. Chawla, Machine Learning, 56 (2004), pp. 89--113], also referred to as graph editing [R. Shamir, R. Sharan, and D. Tsur, Discrete Appl. Math., 144 (2004), pp. 173--182]. Given a bipartite graph, the objective of BCC is to generate a set of vertex disjoint bicliques (clusters) that minimizes the symmetric difference to the original graph. The best-known approximation algorithm for BCC due to Amit [N. Amit, The Bicluster Graph Editing Problem, Master's Thesis, Tel Aviv University, Tel Aviv, Israel, 2004] guarantees an $11$-approximation ratio. In this paper we present two algorithms. The first is a linear program based $4$-approximation algorithm. Like the previous approximation algorithm, it requires solving a large convex problem, which becomes prohibitive even for modestly sized tasks. The second algorithm, and our main contribution, is a simple randomized combinatorial algorithm. It also achieves an expected $4$-approximation factor, and it is trivial to implement and highly scalable. The analysis extends a method developed by Ailon, Charikar, and Newman in 2008, where a randomized pivoting algorithm was analyzed for obtaining a $3$-approximation algorithm for CC. For analyzing our algorithm for BCC, considerably more sophisticated arguments are required in order to take advantage of the bipartite structure. Whether it is possible to achieve (or beat) the $4$-approximation factor using a scalable and deterministic algorithm remains an open problem.

References

[1]
P. Symeonidis, A. Nanopoulos, A. Papadopoulos, and Y. Manolopoulos, Nearest-biclusters collaborative filtering, in Workshop on Web Mining and Web Usage, in conjunction with the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), ACM, New York, 2006.
[2]
N. Amit, The Bicluster Graph Editing Problem, Master's Thesis, Tel Aviv University, Tel Aviv, Israel, 2004.
[3]
S. C. Madeira and A. L. Oliveira, Biclustering algorithms for biological data analysis: A survey, IEEE/ACM Trans. Comput. Biol. Bioinformatics, 1 (2004), pp. 24--45.
[4]
Y. Cheng and G. M. Church, Biclustering of expression data, in Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology, AAAI Press, Palo Alto, CA, 2000, pp. 93--103.
[5]
N. Bansal, A. Blum, and S. Chawla, Correlation clustering, Machine Learning, 56 (2004), pp. 89--113.
[6]
R. Shamir, R. Sharan, and D. Tsur, Cluster graph modification problems, Discrete Appl. Math., 144 (2004), pp. 173--182.
[7]
X. Z. Fern and C. E. Brodley, Solving cluster ensemble problems by bipartite graph partitioning, in Proceedings of the 21st International Conference on Machine Learning, ICML '04, ACM, New York, 2004, pp. 36--43.
[8]
H. Zha, X. He, C. Ding, H. Simon, and M. Gu, Bipartite graph partitioning and data clustering, in Proceedings of the 10th International Conference on Information and Knowledge Management, CIKM '01, ACM, New York, 2001, pp. 25--32.
[9]
E. D. Demaine, D. Emanuel, A. Fiat, and N. Immorlica, Correlation clustering in general weighted graphs, Theoret. Comput. Sci., 361 (2006), pp. 172--187.
[10]
M. Charikar, V. Guruswami, and A. Wirth, Clustering with qualitative information, J. Comput. System Sci., 71 (2005), pp. 360--383.
[11]
N. Ailon, M. Charikar, and A. Newman, Aggregating inconsistent information: Ranking and clustering, J. ACM, 55 (2008), pp. 1--27.
[12]
N. Ailon and E. Liberty, Correlation clustering revisited: The “true" cost of error minimization problems, in Proceedings of the 36th International Colloquium on Automata, Languages and Programming, ICALP '09, Springer-Verlag, Berlin, Heidelberg, 2009, pp. 24--36.
[13]
A. van Zuylen and D. P. Williamson, Deterministic pivoting algorithms for constrained ranking and clustering problems, Math. Oper. Res., 34 (2009), pp. 594--620.
[14]
I. Giotis and V. Guruswami, Correlation clustering with a fixed number of clusters, in Proceedings of the 17th Annual ACM--SIAM Symposium on Discrete Algorithms (SODA), ACM, New York, SIAM, Philadelphia, 2006, pp. 1167--1176.
[15]
M. Karpinski and W. Schudy, Linear time approximation schemes for the Gale-Berlekamp game and related minimization problems, in Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC), ACM, New York, 2009, pp. 313--322.
[16]
J. Guo, F. Hüffner, C. Komusiewicz, and Y. Zhang, Improved algorithms for bicluster editing, in Proceedings of the 5th International Conference on Theory and Applications of Models of Computation, TAMC'08, Springer-Verlag, Berlin, Heidelberg, 2008, pp. 445--456.

Cited By

View all
  • (2024)Improved kernelization and fixed-parameter algorithms for bicluster editingJournal of Combinatorial Optimization10.1007/s10878-024-01186-y47:5Online publication date: 3-Jul-2024
  • (2023)Faster Approximation Algorithms for Parameterized Graph Clustering and Edge LabelingProceedings of the 32nd ACM International Conference on Information and Knowledge Management10.1145/3583780.3614878(78-87)Online publication date: 21-Oct-2023
  • (2023)Approximation algorithms for the capacitated correlation clustering problem with penaltiesJournal of Combinatorial Optimization10.1007/s10878-022-00930-645:1Online publication date: 1-Jan-2023
  • Show More Cited By

Index Terms

  1. Improved Approximation Algorithms for Bipartite Correlation Clustering
            Index terms have been assigned to the content through auto-classification.

            Recommendations

            Comments

            Please enable JavaScript to view thecomments powered by Disqus.

            Information & Contributors

            Information

            Published In

            cover image SIAM Journal on Computing
            SIAM Journal on Computing  Volume 41, Issue 5
            Special Section on the Forty-Third Annual ACM Symposium on Theory of Computing (STOC 2011)
            2012
            247 pages
            ISSN:0097-5397
            DOI:10.1137/smjcat.41.5
            Issue’s Table of Contents

            Publisher

            Society for Industrial and Applied Mathematics

            United States

            Publication History

            Published: 01 January 2012

            Author Tags

            1. bipartite correlation clustering
            2. bipartite graph editing

            Author Tag

            1. 68Q25

            Qualifiers

            • Research-article

            Contributors

            Other Metrics

            Bibliometrics & Citations

            Bibliometrics

            Article Metrics

            • Downloads (Last 12 months)0
            • Downloads (Last 6 weeks)0
            Reflects downloads up to 13 Nov 2024

            Other Metrics

            Citations

            Cited By

            View all
            • (2024)Improved kernelization and fixed-parameter algorithms for bicluster editingJournal of Combinatorial Optimization10.1007/s10878-024-01186-y47:5Online publication date: 3-Jul-2024
            • (2023)Faster Approximation Algorithms for Parameterized Graph Clustering and Edge LabelingProceedings of the 32nd ACM International Conference on Information and Knowledge Management10.1145/3583780.3614878(78-87)Online publication date: 21-Oct-2023
            • (2023)Approximation algorithms for the capacitated correlation clustering problem with penaltiesJournal of Combinatorial Optimization10.1007/s10878-022-00930-645:1Online publication date: 1-Jan-2023
            • (2022)Efficient and effective optimal transport-based biclusteringProceedings of the 36th International Conference on Neural Information Processing Systems10.5555/3600270.3602660(32989-33000)Online publication date: 28-Nov-2022
            • (2022)Approximation algorithms for the lower bounded correlation clustering problemJournal of Combinatorial Optimization10.1007/s10878-022-00976-645:1Online publication date: 31-Dec-2022
            • (2022)Approximation algorithms for two variants of correlation clustering problemJournal of Combinatorial Optimization10.1007/s10878-020-00612-143:5(933-952)Online publication date: 1-Jul-2022
            • (2021)Approximation Algorithms for the Lower Bounded Correlation Clustering ProblemComputational Data and Social Networks10.1007/978-3-030-91434-9_4(39-49)Online publication date: 15-Nov-2021
            • (2020)Parameterized Correlation Clustering in Hypergraphs and Bipartite GraphsProceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining10.1145/3394486.3403238(1868-1876)Online publication date: 23-Aug-2020
            • (2020)Approximation Algorithm for the Balanced 2-correlation Clustering Problem on Well-Proportional GraphsAlgorithmic Aspects in Information and Management10.1007/978-3-030-57602-8_9(97-107)Online publication date: 10-Aug-2020
            • (2019)Approximation Algorithm for the Correlation Clustering Problem with Non-uniform Hard Constrained Cluster SizesAlgorithmic Aspects in Information and Management10.1007/978-3-030-27195-4_15(159-168)Online publication date: 6-Aug-2019
            • Show More Cited By

            View Options

            View options

            Get Access

            Login options

            Media

            Figures

            Other

            Tables

            Share

            Share

            Share this Publication link

            Share on social media