Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2594538.2594552acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
research-article

The input/output complexity of triangle enumeration

Published: 18 June 2014 Publication History

Abstract

We consider the well-known problem of enumerating all triangles of an undirected graph. Our focus is on determining the input/output (I/O) complexity of this problem. Let E be the number of edges, M<E the size of internal memory, and B the block size. The best results obtained previously are sortE3/2) I/Os (Dementiev, PhD thesis 2006) and O(E2/MB) I/Os (Hu et al., SIGMOD 2013), where sort(n) denotes the number of I/Os for sorting n items. We improve the I/O complexity to O(E3/2/(√MB) expected I/Os, which improves the previous bounds by a factor min(√E/M),√M). Our algorithm is cache-oblivious and also I/O optimal: We show that any algorithm enumerating t distinct triangles must always use Ω(√MB) I/Os, and there are graphs for which t=Ω(E3/2). Finally, we give a deterministic cache-aware algorithm using O(E3/2/√MB) I/Os assuming M > Ec for a constant c > 0. Our results are based on a new color coding technique, which may be of independent interest.

References

[1]
Foto N. Afrati, Anish Das Sarma, Semih Salihoglu, and Jeffrey D. Ullman. Upper and lower bounds on the cost of a Map-Reduce computation. Proc. VLDB Endow., 6(4):277--288, 2013.
[2]
Alok Aggarwal and Jeffrey Scott Vitter. The input/output complexity of sorting and related problems. Commun. ACM, 31(9), 1988.
[3]
Noga Alon, Oded Goldreich, Johan Hástad, and René Peralta. Simple construction of almost k-wise independent random variables. Random Struct. Algorithms, 3(3):289--304, 1992.
[4]
Lars Arge and Peter Bro Miltersen. On showing lower bounds for external-memory computational geometry problems. In James M. Abello and Jeffrey S. Vitter, editors, External Memory Algorithms, pages 139--159. AMS, 1999.
[5]
Jonathan Berry, Luke Fostvedt, Daniel Nordman, Cynthia Phillips, C. Seshadhri, and Alyson Wilson. Why do simple algorithms for triangle enumeration work in the real world? In Proc. 5th Innovations in Theoretical Computer Science, 2014.
[6]
Jonathan W. Berry, Bruce Hendrickson, Randall A. LaViolette, and Cynthia A. Phillips. Tolerating the community detection resolution limit with edge weighting. Phys. Rev. E, 83:056119, May 2011.
[7]
Gerth S. Brodal and Rolf Fagerberg. On the limits of cache-obliviousness. In Proc. 35th ACM Symposium on Theory of Computing, pages 307--315, 2003.
[8]
Shumo Chu and James Cheng. Triangle listing in massive networks. ACM Trans. Knowl. Discov. Data, 6(4):17:1--17:32, 2012.
[9]
Roman Dementiev. Algorithm engineering for large data sets: hardware, software, algorithms. PhD thesis, Saarland University, 2007.
[10]
Matteo Frigo, Charles E. Leiserson, Harald Prokop, and Sridhar Ramachandran. Cache-oblivious algorithms. In Proc. 40th Symposium on Foundations of Computer Science, pages 285--298, 1999.
[11]
Matteo Frigo, Charles E. Leiserson, Harald Prokop, and Sridhar Ramachandran. Cache-oblivious algorithms. ACM Transactions on Algorithms, 8(1):4, 2012.
[12]
Ioannis Fudos and Christoph M. Hoffmann. A graph-constructive approach to solving systems of geometric constraints. ACM Trans. Graph., 16(2):179--216, 1997.
[13]
Xiaocheng Hu, Yufei Tao, and Chin-Wan Chung. I/O-efficient algorithms on triangle listing and counting. Full version under submission (Yufei Tao, personal communication), 2013.
[14]
Xiaocheng Hu, Yufei Tao, and Chin-Wan Chung. Massive graph triangulation. In Proc. ACM SIGMOD International Conference on Management of Data, pages 325--336. ACM, 2013.
[15]
Zahra Jafargholi and Emanuele Viola. 3SUM, 3XOR, Triangles. ArXiv e-prints, May 2013. 1305.3827.
[16]
William Kent. A simple guide to ve normal forms in relational database theory. Commun. ACM, 26(2):120--125, 1983.
[17]
Mihail N. Kolountzakis, Gary L. Miller, Richard Peng, and Charalampos E. Tsourakakis. Efficient triangle counting in large graphs via degree-based vertex partitioning. Internet Mathematics, 8(1--2):161--185, 2012.
[18]
Bruno Menegola. An external memory algorithm for listing triangles. Technical report, Federal University of Rio Grande Sul, 2010.
[19]
Rasmus Pagh and Morten Stockel. The Input/Output Complexity of Sparse Matrix Multiplication. ArXiv e-prints, March 2014. 1403.3551.

Cited By

View all
  • (2024)Turing machines with two-level memory: New computational models for analyzing the input/output complexityTheoretical Computer Science10.1016/j.tcs.2023.114347985(114347)Online publication date: Feb-2024
  • (2024)Parallelization of butterfly counting on hierarchical memoryThe VLDB Journal10.1007/s00778-024-00856-x33:5(1453-1484)Online publication date: 7-Jun-2024
  • (2023)I/O-Efficient Butterfly Counting at ScaleProceedings of the ACM on Management of Data10.1145/35887141:1(1-27)Online publication date: 30-May-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
PODS '14: Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2014
300 pages
ISBN:9781450323758
DOI:10.1145/2594538
  • General Chair:
  • Richard Hull,
  • Program Chair:
  • Martin Grohe
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 June 2014

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. cache-aware
  2. cache-oblivious
  3. external memory
  4. lower bound
  5. triangle listing

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'14
Sponsor:

Acceptance Rates

PODS '14 Paper Acceptance Rate 22 of 67 submissions, 33%;
Overall Acceptance Rate 642 of 2,707 submissions, 24%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)14
  • Downloads (Last 6 weeks)1
Reflects downloads up to 24 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Turing machines with two-level memory: New computational models for analyzing the input/output complexityTheoretical Computer Science10.1016/j.tcs.2023.114347985(114347)Online publication date: Feb-2024
  • (2024)Parallelization of butterfly counting on hierarchical memoryThe VLDB Journal10.1007/s00778-024-00856-x33:5(1453-1484)Online publication date: 7-Jun-2024
  • (2023)I/O-Efficient Butterfly Counting at ScaleProceedings of the ACM on Management of Data10.1145/35887141:1(1-27)Online publication date: 30-May-2023
  • (2022)sGrapp: Butterfly Approximation in Streaming GraphsACM Transactions on Knowledge Discovery from Data10.1145/349501116:4(1-43)Online publication date: 8-Jan-2022
  • (2022)A Block-Based Triangle Counting Algorithm on Heterogeneous EnvironmentsIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2021.309324033:2(444-458)Online publication date: 1-Feb-2022
  • (2022)Turing Machines with Two-Level Memory: A Deep Look into the Input/Output ComplexityComputing and Combinatorics10.1007/978-3-031-22105-7_18(199-211)Online publication date: 22-Oct-2022
  • (2021)Two-Attribute Skew Free, Isolated CP Theorem, and Massively Parallel JoinsProceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems10.1145/3452021.3458321(166-180)Online publication date: 20-Jun-2021
  • (2021)Theoretically Efficient Parallel Graph Algorithms Can Be Fast and ScalableACM Transactions on Parallel Computing10.1145/34343938:1(1-70)Online publication date: 22-Apr-2021
  • (2020)Improving I/O Complexity of Triangle EnumerationIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2020.3003259(1-1)Online publication date: 2020
  • (2019)Output-Optimal Massively Parallel Algorithms for Similarity JoinsACM Transactions on Database Systems10.1145/331196744:2(1-36)Online publication date: 8-Apr-2019
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media