Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2783258.2783299acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article

Locally Densest Subgraph Discovery

Published: 10 August 2015 Publication History

Abstract

Mining dense subgraphs from a large graph is a fundamental graph mining task and can be widely applied in a variety of application domains such as network science, biology, graph database, web mining, graph compression, and micro-blogging systems. Here a dense subgraph is defined as a subgraph with high density (#.edge / #.node). Existing studies of this problem either focus on finding the densest subgraph or identifying an optimal clique-like dense subgraph, and they adopt a simple greedy approach to find the top-k dense subgraphs. However, their identified subgraphs cannot be used to represent the dense regions of the graph. Intuitively, to represent a dense region, the subgraph identified should be the subgraph with highest density in its local region in the graph. However, it is non-trivial to formally model a locally densest subgraph. In this paper, we aim to discover top-k such representative locally densest subgraphs of a graph. We provide an elegant parameter-free definition of a locally densest subgraph. The definition not only fits well with the intuition, but is also associated with several nice structural properties. We show that the set of locally densest subgraphs in a graph can be computed in polynomial time. We further propose three novel pruning strategies to largely reduce the search space of the algorithm. In our experiments, we use several real datasets with various graph properties to evaluate the effectiveness of our model using four quality measures and a case study. We also test our algorithms on several real web-scale graphs, one of which contains 118.14 million nodes and 1.02 billion edges, to demonstrate the high efficiency of the proposed algorithms.

Supplementary Material

MP4 File (p965.mp4)

References

[1]
B. D. Abrahao, S. Soundarajan, J. E. Hopcroft, and R. Kleinberg. On the separability of structural classes of communities. In Proc. of KDD'12, 2012.
[2]
T. Akiba, Y. Iwata, and Y. Yoshida. Linear-time enumeration of maximal k-edge-connected subgraphs in large networks by random contraction. In Proc. of CIKM'13, 2013.
[3]
R. Andersen and K. Chellapilla. Finding dense subgraphs with size bounds. In WAW, volume 5427 of Lecture Notes in Computer Science. Springer, 2009.
[4]
A. Angel, N. Koudas, N. Sarkas, and D. Srivastava. Dense subgraph maintenance under streaming edge weight updates for real-time story identification. PVLDB, 5(6), 2012.
[5]
Y. Asahiro, R. Hassin, and K. Iwama. Complexity of finding dense subgraphs. Discrete Applied Mathematics, 121(1--3), 2002.
[6]
Y. Asahiro, K. Iwama, H. Tamaki, and T. Tokuyama. Greedily finding a dense subgraph. J. Algorithms, 34(2), 2000.
[7]
B. Bahmani, R. Kumar, and S. Vassilvitskii. Densest subgraph in streaming and mapreduce. PVLDB, 5(5), 2012.
[8]
P. Basaras, D. Katsaros, and L. Tassiulas. Detecting influential spreaders in complex, dynamic networks. IEEE Computer, 46(4), 2013.
[9]
V. Batagelj and M. Zaversnik. An o(m) algorithm for cores decomposition of networks. CoRR, cs.DS/0310049, 2003.
[10]
A. Beutel, W. Xu, V. Guruswami, C. Palow, and C. Faloutsos. Copycatch: stopping group attacks by spotting lockstep behavior in social networks. In Proc. of WWW'13, 2013.
[11]
G. Buehrer and K. Chellapilla. A scalable pattern mining approach to web graph compression with communities. In WSDM'08, 2008.
[12]
L. Chang, J. X. Yu, L. Qin, X. Lin, C. Liu, and W. Liang. Efficiently computing k-edge connected components via graph decomposition. In Proc. of SIGMOD'13, 2013.
[13]
M. Charikar. Greedy approximation algorithms for finding dense components in a graph. In Proc. of APPROX'00, 2000.
[14]
J. Chen and Y. Saad. Dense subgraph extraction with application to community detection. TKDE, 24(7), 2012.
[15]
J. Cheng, Y. Ke, S. Chu, and M. T. Özsu. Efficient core decomposition in massive networks. In Proc. of ICDE'11, 2011.
[16]
J. Cheng, Y. Ke, A. W.-C. Fu, J. X. Yu, and L. Zhu. Finding maximal cliques in massive networks. Trans. Database Syst., 36(4), 2011.
[17]
E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. In Proc. of SODA'02, 2002.
[18]
J. Cohen. Trusses: Cohesive subgraphs for social network analysis. Technique report, National Security Agency, 2008.
[19]
T. H. Cormen, C. Stein, R. L. Rivest, and C. E. Leiserson. Introduction to Algorithms. McGraw-Hill, 2001.
[20]
Y. Dourisboure, F. Geraci, and M. Pellegrini. Extraction and classification of dense communities in the web. In WWW'07, 2007.
[21]
E. Fratkin, B. T. Naughton, D. L. Brutlag, and S. Batzoglou. Motifcut: regulatory motifs finding with maximum density subgraphs. Bioinformatics, 22(14), 2006.
[22]
G. Gallo, M. D. Grigoriadis, and R. E. Tarjan. A fast parametric maximum flow algorithm and applications. J. Comput., 18(1), 1989.
[23]
D. Gibson, R. Kumar, and A. Tomkins. Discovering large dense subgraphs in massive graphs. In Proc. of VLDB'05, 2005.
[24]
D. F. Gleich and C. Seshadhri. Vertex neighborhoods, low conductance cuts, and good seeds for local community methods. In Proc. of KDD'12, 2012.
[25]
A. V. Goldberg. Finding a maximum density subgraph. Technical report, University of California at Berkeley, 1984.
[26]
R. Jin, Y. Xiang, N. Ruan, and D. Fuhry. 3-hop: a high-compression indexing scheme for reachability query. In SIGMOD'09, 2009.
[27]
S. Khuller and B. Saha. On finding dense subgraphs. In Proc. of ICALP'09, 2009.
[28]
V. E. Lee, N. Ruan, R. Jin, and C. C. Aggarwal. A survey of algorithms for dense subgraph discovery. In Managing and Mining Graph Data. Springer, 2010.
[29]
R. Li, J. X. Yu, and R. Mao. Efficient core maintenance in large dynamic graphs. TKDE, 26(10), 2014.
[30]
B. A. Prakash, A. Sridharan, M. Seshadri, S. Machiraju, and C. Faloutsos. Eigenspokes: Surprising patterns and scalable community chipping in large graphs. In Proc. of PAKDD'10, 2010.
[31]
B. Saha, A. Hoch, S. Khuller, L. Raschid, and X. Zhang. Dense subgraphs with restrictions and applications to gene annotation graphs. In Proc. of RECOMB'10, 2010.
[32]
B. Saha, A. Hoch, S. Khuller, L. Raschid, and X. Zhang. Dense subgraphs with restrictions and applications to gene annotation graphs. In Proc. of RECOMB'10, 2010.
[33]
S. E. Schaeffer. Survey: Graph clustering. Comput. Sci. Rev., 1(1), 2007.
[34]
S. B. Seidman. Network structure and minimum degree. Social networks, 5(3), 1983.
[35]
N. Shah, A. Beutel, B. Gallagher, and C. Faloutsos. Spotting suspicious link behavior with fbox: An adversarial perspective. In Proc. of ICDM'14, 2014.
[36]
C. E. Tsourakakis. Mathematical and Algorithmic Analysis of Network and Biological Data. PhD thesis, Carnegie Mellon University, 2013.
[37]
C. E. Tsourakakis, F. Bonchi, A. Gionis, F. Gullo, and M. A. Tsiarli. Denser than the densest subgraph: extracting optimal quasi-cliques with quality guarantees. In Proc. of KDD'13, 2013.
[38]
J. Wang and J. Cheng. Truss decomposition in massive networks. PVLDB, 5(9), 2012.
[39]
S. Wasserman and K. Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, 1994.

Cited By

View all
  • (2024)Conditional Community Search Based on Weight InformationElectronics10.3390/electronics1321432113:21(4321)Online publication date: 4-Nov-2024
  • (2024)Efficient Algorithms for Density Decomposition on Large Static and Dynamic GraphsProceedings of the VLDB Endowment10.14778/3681954.368197417:11(2933-2945)Online publication date: 30-Aug-2024
  • (2024)A Counting-based Approach for Efficient k-Clique Densest Subgraph DiscoveryProceedings of the ACM on Management of Data10.1145/36549222:3(1-27)Online publication date: 30-May-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2015
2378 pages
ISBN:9781450336642
DOI:10.1145/2783258
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 10 August 2015

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. big data
  2. dense subgraph
  3. graph

Qualifiers

  • Research-article

Funding Sources

  • Natural Science Foundation of SZU
  • ARC
  • NSFC

Conference

KDD '15
Sponsor:

Acceptance Rates

KDD '15 Paper Acceptance Rate 160 of 819 submissions, 20%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)66
  • Downloads (Last 6 weeks)5
Reflects downloads up to 13 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Conditional Community Search Based on Weight InformationElectronics10.3390/electronics1321432113:21(4321)Online publication date: 4-Nov-2024
  • (2024)Efficient Algorithms for Density Decomposition on Large Static and Dynamic GraphsProceedings of the VLDB Endowment10.14778/3681954.368197417:11(2933-2945)Online publication date: 30-Aug-2024
  • (2024)A Counting-based Approach for Efficient k-Clique Densest Subgraph DiscoveryProceedings of the ACM on Management of Data10.1145/36549222:3(1-27)Online publication date: 30-May-2024
  • (2024)On Density-based Local Community SearchProceedings of the ACM on Management of Data10.1145/36515892:2(1-25)Online publication date: 14-May-2024
  • (2024)A Fast Exact Algorithm to Enumerate Maximal Pseudo-cliques in Large Sparse GraphsProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3672066(2479-2490)Online publication date: 25-Aug-2024
  • (2024)Unified Dense Subgraph Detection: Fast Spectral Theory Based AlgorithmsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2023.327257436:3(1356-1370)Online publication date: Mar-2024
  • (2024)Efficient and effective algorithms for densest subgraph discovery and maintenanceThe VLDB Journal10.1007/s00778-024-00855-y33:5(1427-1452)Online publication date: 8-May-2024
  • (2023)Efficient and Effective Algorithms for Generalized Densest Subgraph DiscoveryProceedings of the ACM on Management of Data10.1145/35893141:2(1-27)Online publication date: 20-Jun-2023
  • (2023)Scaling Up k-Clique Densest Subgraph DetectionProceedings of the ACM on Management of Data10.1145/35889231:1(1-26)Online publication date: 30-May-2023
  • (2023)Densest Periodic Subgraph Mining on Large Temporal GraphsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2022.323378835:11(11259-11273)Online publication date: 1-Nov-2023
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media