Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1007/978-3-642-31464-3_29guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

Parallel community detection for massive graphs

Published: 11 September 2011 Publication History

Abstract

Tackling the current volume of graph-structured data requires parallel tools. We extend our work on analyzing such massive graph data with the first massively parallel algorithm for community detection that scales to current data sizes, scaling to graphs of over 122 million vertices and nearly 2 billion edges in under 7300 seconds on a massively multithreaded Cray XMT. Our algorithm achieves moderate parallel scalability without sacrificing sequential operational complexity. Community detection partitions a graph into subgraphs more densely connected within the subgraph than to the rest of the graph. We take an agglomerative approach similar to Clauset, Newman, and Moore's sequential algorithm, merging pairs of connected intermediate subgraphs to optimize different graph properties. Working in parallel opens new approaches to high performance. On smaller data sets, we find the output's modularity compares well with the standard sequential algorithms.

References

[1]
Bader, D., Gilbert, J., Kepner, J., Koester, D., Loh, E., Madduri, K., Mann, W., Meuse, T.: HPCS SSCA#2 Graph Analysis Benchmark Specifications v1.1 (July 2005).
[2]
Bader, D., Madduri, K.: SNAP, Small-world Network Analysis and Partitioning: an open-source parallel graph framework for the exploration of large-scale networks. In: Proc. Int'l. Parallel and Distributed Processing Symp. (IPDPS 2008), Miami, FL (April 2008).
[3]
Bader, D., McCloskey, J.: Modularity and graph algorithms (September 2009), presented at UMBC.
[4]
Berry, J., Hendrickson, B., LaViolette, R., Phillips, C.: Tolerating the community detection resolution limit with edge weighting. Phys. Rev. E 83, 056119 (2011).
[5]
Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment 2008(10), P10008 (2008).
[6]
Brandes, U., Delling, D., Gaertler, M., Görke, R., Hoefer, M., Nikoloski, Z., Wagner, D.: On modularity clustering. IEEE Trans. Knowledge and Data Engineering 20(2), 172-188 (2008).
[7]
Chakrabarti, D., Zhan, Y., Faloutsos, C.: R-MAT: A recursive model for graph mining. In: Proc. 4th SIAM Intl. Conf. on Data Mining (SDM). SIAM, Orlando (2004).
[8]
Clauset, A., Newman, M., Moore, C.: Finding community structure in very large networks. Physical Review E 70(6), 66111 (2004).
[9]
Facebook, Inc.: User statistics (October 2011), http://www.facebook.com/press/info.php?statistics
[10]
Fortunato, S., Barthélemy, M.: Resolution limit in community detection. Proc. of the National Academy of Sciences 104(1), 36-41 (2007).
[11]
Fortunato, S.: Community detection in graphs. Physics Reports 486(3-5), 75-174 (2010).
[12]
Gehweiler, J., Meyerhenke, H.: A distributed diffusive heuristic for clustering a virtual P2P supercomputer. In: Proc. 7th High-Performance Grid Computing Workshop (HGCW 2010) in Conjunction with 24th Intl. Parallel and Distributed Processing Symposium (IPDPS 2010). IEEE Computer Society (2010).
[13]
Hoepman, J.H.: Simple distributed weighted matchings. CoRR cs.DC/0410047 (2004).
[14]
Lozano, S., Duch, J., Arenas, A.: Analysis of large social datasets by community detection. The European Physical Journal - Special Topics 143, 257-259 (2007).
[15]
Manne, F., Bisseling, R.: A Parallel Approximation Algorithm for the Weighted Maximum Matching Problem. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Wasniewski, J. (eds.) PPAM 2007. LNCS, vol. 4967, pp. 708-717. Springer, Heidelberg (2008).
[16]
Newman, M.: Modularity and community structure in networks. Proc. of the National Academy of Sciences 103(23), 8577-8582 (2006).
[17]
Newman, M., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69(2), 026113 (2004).
[18]
Noack, A., Rotta, R.: Multi-level Algorithms for Modularity Clustering. In: Vahrenhold, J. (ed.) SEA 2009. LNCS, vol. 5526, pp. 257-268. Springer, Heidelberg (2009).
[19]
Novick, M.B.: Fast parallel algorithms for the modular decomposition. Tech. rep., Cornell University, Ithaca, NY, USA (1989).
[20]
NYSE Euronext: Consolidated volume in NYSE listed issues, 2010 - current (March 2011), http://www.nyxdata.com/nysedata/asp/factbook/viewer_edition.asp? mode=table&key=3139&category=3
[21]
Preis, R.: Linear Time 1/2-Approximation Algorithm for Maximum Weighted Matching in General Graphs. In: Meinel, C., Tison, S. (eds.) STACS 1999. LNCS, vol. 1563, pp. 259-269. Springer, Heidelberg (1999).
[22]
Radicchi, F., Castellano, C., Cecconi, F., Loreto, V., Parisi, D.: Defining and identifying communities in networks. Proc. of the National Academy of Sciences 101(9), 2658 (2004).
[23]
Ravasz, E., Somera, A.L., Mongru, D.A., Oltvai, Z.N., Barabási, A.L.: Hierarchical organization of modularity in metabolic networks. Science 297(5586), 1551-1555 (2002).
[24]
Twitter, Inc.: Happy birthday Twitter! (March 2011), http://blog.twitter.com/2011/03/happy-birthday-twitter.html
[25]
Wakita, K., Tsurumi, T.: Finding community structure in mega-scale social networks. CoRR abs/cs/0702048 (2007).
[26]
Wilkinson, D.M., Huberman, B.A.: A method for finding communities of related genes. Proceedings of the National Academy of Sciences of the United States of America 101(suppl. 1), 5241-5248 (2004).
[27]
Zhang, Y., Wang, J., Wang, Y., Zhou, L.: Parallel community detection on large networks with propinquity dynamics. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2009, pp. 997-1006. ACM, New York (2009).

Cited By

View all
  1. Parallel community detection for massive graphs

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image Guide Proceedings
    PPAM'11: Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part I
    September 2011
    764 pages
    ISBN:9783642314636
    • Editors:
    • Roman Wyrzykowski,
    • Jack Dongarra,
    • Konrad Karczewski,
    • Jerzy Waśniewski

    Sponsors

    • INTEL: Intel Corporation
    • Microsoft Corporation: Microsoft Corporation
    • AMD: Advanced Micro Devices
    • HP: Hewlett-Packard Company
    • IBM Corporation

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Publication History

    Published: 11 September 2011

    Author Tags

    1. community detection
    2. graph analysis
    3. parallel algorithm

    Qualifiers

    • Article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 19 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2021)A PGAS-Based Implementation for the Parallel Minimum Spanning Tree AlgorithmLarge-Scale Scientific Computing10.1007/978-3-030-97549-4_49(431-438)Online publication date: 7-Jun-2021
    • (2018)A Parallel Approach to Detect Communities in Evolving NetworksBig Data Analytics10.1007/978-3-030-04780-1_13(188-203)Online publication date: 18-Dec-2018
    • (2017)Accelerating Graph Community Detection with Approximate Updates via an Energy-Efficient NoCProceedings of the 54th Annual Design Automation Conference 201710.1145/3061639.3062194(1-6)Online publication date: 18-Jun-2017
    • (2017)Scalable and Efficient Flow-Based Community Detection for Large-Scale Graph AnalysisACM Transactions on Knowledge Discovery from Data10.1145/299278511:3(1-30)Online publication date: 6-Mar-2017
    • (2016)High-Performance and Energy-Efficient Network-on-Chip Architectures for Graph AnalyticsACM Transactions on Embedded Computing Systems10.1145/296102715:4(1-26)Online publication date: 1-Sep-2016
    • (2015)High performance and energy efficient wireless NoC-enabled multicore architectures for graph analyticsProceedings of the 2015 International Conference on Compilers, Architecture and Synthesis for Embedded Systems10.5555/2830689.2830708(147-156)Online publication date: 4-Oct-2015
    • (2015)Parallelizing SLPA for scalable overlapping community detectionScientific Programming10.1155/2015/4613622015(4-4)Online publication date: 1-Jan-2015
    • (2015)GossipMapProceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis10.1145/2807591.2807668(1-12)Online publication date: 15-Nov-2015
    • (2015)Branch-Avoiding Graph AlgorithmsProceedings of the 27th ACM symposium on Parallelism in Algorithms and Architectures10.1145/2755573.2755580(212-223)Online publication date: 13-Jun-2015
    • (2015)Parallel heuristics for scalable community detectionParallel Computing10.1016/j.parco.2015.03.00347:C(19-37)Online publication date: 1-Aug-2015
    • Show More Cited By

    View Options

    View options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media