On Caching for Local Graph Clustering Algorithms

  • Conference paper
AI 2013: Advances in Artificial Intelligence (AI 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8272)

Abstract

In recent years, local graph clustering techniques have been used to unveil the hidden structure of large networks. With the ever-growing size of the data sets generated in application domains as diverse as biomedicine and natural language processing, time efficiency has become increasingly important. We address the runtime of local graph clustering algorithms by presenting the novel caching approach SGD⋆, which combines the Segmented Least Recently Used and Greedy Dual strategies. By applying different caching strategies to the unprotected and protected segments of a cache, SGD⋆ achieves a superior hit rate and can thereby significantly reduce the runtime of clustering algorithms. We evaluate our approach on four real protein-protein interaction graphs. Our evaluation shows that SGD⋆ achieves a considerably higher hit rate than state-of-the-art approaches. In addition, we show that combining caching strategies with a simple data reordering approach significantly improves the hit rate of state-of-the-art caching strategies.
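To make the notion of unprotected and protected cache segments concrete, the following is a minimal sketch of plain Segmented Least Recently Used (SLRU), one of the two strategies the abstract names. It is not the authors' SGD⋆: the abstract does not specify how Greedy Dual is applied to the segments, so this sketch uses LRU eviction in both. The class name, method names, and the `load` callback are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class SegmentedLRUCache:
    """Plain Segmented LRU (SLRU). Illustrative sketch only, not SGD*."""

    def __init__(self, probation_size, protected_size):
        self.probation_size = probation_size
        self.protected_size = protected_size
        self.probation = OrderedDict()  # unprotected segment, oldest entry first
        self.protected = OrderedDict()  # protected segment, oldest entry first
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        """Return the value for key, calling load(key) on a cache miss."""
        if key in self.protected:
            # Hit in the protected segment: refresh its recency.
            self.hits += 1
            self.protected.move_to_end(key)
            return self.protected[key]
        if key in self.probation:
            # Second access: promote from unprotected to protected.
            self.hits += 1
            value = self.probation.pop(key)
            self._promote(key, value)
            return value
        # Miss: new entries always start in the unprotected segment.
        self.misses += 1
        value = load(key)
        self._insert_probation(key, value)
        return value

    def _promote(self, key, value):
        self.protected[key] = value
        if len(self.protected) > self.protected_size:
            # Demote the least recently used protected entry.
            old_key, old_value = self.protected.popitem(last=False)
            self._insert_probation(old_key, old_value)

    def _insert_probation(self, key, value):
        self.probation[key] = value
        if len(self.probation) > self.probation_size:
            # Evict the least recently used unprotected entry for good.
            self.probation.popitem(last=False)

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

In a clustering setting, `load` could stand for an expensive per-node computation, such as fetching a node's neighbourhood, with `hit_rate()` reporting how often that work was avoided. A Greedy Dual variant would additionally track a cost-based priority per entry, which this sketch deliberately omits.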

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Speck, R., Ngonga Ngomo, A.-C. (2013). On Caching for Local Graph Clustering Algorithms. In: Cranefield, S., Nayak, A. (eds) AI 2013: Advances in Artificial Intelligence. AI 2013. Lecture Notes in Computer Science, vol 8272. Springer, Cham. https://doi.org/10.1007/978-3-319-03680-9_6

  • DOI: https://doi.org/10.1007/978-3-319-03680-9_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-03679-3

  • Online ISBN: 978-3-319-03680-9

  • eBook Packages: Computer Science, Computer Science (R0)
