Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2798087.2798089acmconferencesArticle/Chapter ViewAbstractPublication PagesmobisysConference Proceedingsconference-collections
research-article

Clustering Large Undirected Graphs on External Memory

Published: 07 September 2015 Publication History

Abstract

Traditional graph clustering methods perform poorly on real world power-law graphs out of core. To tackle this challenge, in this paper, we propose an algorithm to cluster such large power law graphs in case of small memory size. In the proposed method, clusters (connected components) are formed by removing top degree nodes (hubs) from the graph. In order to minimize the time for selecting hubs, a divide and conquer strategy is adopted so hubs are selected locally. Compared with the state of art Slashbrun [slashburn2], the proposed algorithm can achieve 30$\times$ faster running time with about 6% more storage cost.

References

[1]
P. Boldi and S. Vigna. The webgraph framework i: compression techniques, 2004.
[2]
F. Chierichetti, R. Kumar, S. Lattanzi, M. Mitzenmacher, A. Panconesi, and P. Raghavan. On compressing social networks, 2009.
[3]
T. M. Cover and J. A. Thomas. Elements of information theory. 1991.
[4]
J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. Powergraph: Distributed graph-parallel computation on natural graphs. In OSDI, volume 12, page 2, 2012.
[5]
U. Kang and C. Faloutsos. Beyond 'caveman communities': Hubs and spokes for graph compression and mining. In Data Mining (ICDM), 2011 IEEE 11th International Conference on, pages 300--309, Dec 2011.
[6]
Z. Karni and C. Gotsman. Spectral compression of mesh geometry. In Proceedings of the 27th annual conference on Computer graphics and interactive techniques, pages 279--286. ACM Press/Addison-Wesley Publishing Co., 2000.
[7]
A. Kyrola, G. E. Blelloch, and C. Guestrin. Graphchi: Large-scale graph computation on just a PC. In 10th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2012, Hollywood, CA, USA, October 8--10, 2012, pages 31--46, 2012.
[8]
J. Leskovec and A. Krevl. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, June 2014.
[9]
J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney. Statistical properties of community structure in large social and information networks. In Proceedings of the 17th International Conference on World Wide Web, WWW 2008, Beijing, China, April 21--25, 2008, pages 695--704, 2008.
[10]
Y. Lim, U. Kang, and C. Faloutsos. Slashburn: Graph compression and mining beyond caveman communities. Knowledge and Data Engineering, IEEE Transactions on, PP(99):1--1, 2014.
[11]
Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein. Distributed graphlab: A framework for machine learning in the cloud. PVLDB, 5(8):716--727, 2012.
[12]
A. Lumsdaine, D. Gregor, B. Hendrickson, and J. Berry. Challenges in parallel graph processing. Parallel Processing Letters, 17(1):5--20, 2007 2007.
[13]
G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, pages 135--146. ACM, 2010.
[14]
W. Rao, L. Chen, P. Hui, and S. Tarkoma. Bitlist: New full-text index for low space cost and efficient keyword search. PVLDB, 6(13):1522--1533, 2013.
[15]
A. Roy, I. Mihailovic, and W. Zwaenepoel. X-stream: Edge-centric graph processing using streaming partitions. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pages 472--488. ACM, 2013.
[16]
B. Shao, H. Wang, and Y. Li. Trinity: A distributed graph engine on a memory cloud. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data (SIGMOD), pages 505--516, New York, NY, USA, 2013. ACM.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
HotPlanet '15: Proceedings of the 6th International Workshop on Hot Topics in Planet-Scale Measurement
September 2015
44 pages
ISBN:9781450335348
DOI:10.1145/2798087
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 September 2015

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. graph clustering
  2. power law
  3. social network

Qualifiers

  • Research-article

Funding Sources

  • Science and Technology Commission of Shanghai Municipality

Conference

MobiCom'15
Sponsor:

Acceptance Rates

Overall Acceptance Rate 11 of 20 submissions, 55%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 94
    Total Downloads
  • Downloads (Last 12 months)1
  • Downloads (Last 6 weeks)0
Reflects downloads up to 15 Feb 2025

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media