Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article
Free access
Just Accepted

Tight Bounds for Monotone Minimal Perfect Hashing

Online AM: 08 August 2024 Publication History

Abstract

The monotone minimal perfect hash function (MMPHF) problem is the following indexing problem. Given a set \(S=\{s_{1},\ldots,s_{n}\}\) of \(n\) distinct keys from a universe \(U\) of size \(u\), create a data structure \(\text{D}\) that answers the following query:
\begin{equation*} \rm{R\small{ANK}}(q) = \begin{cases} \text{rank of } q \text{ in } S & q\in S\\ \text{arbitrary answer} & \text{otherwise.}\end{cases}\end{equation*}
Solutions to the MMPHF problem are in widespread use in both theory and practice.
The best upper bound known for the problem encodes \(\text{D}\) in \(O(n\log\log\log u)\) bits and performs queries in \(O(\log u)\) time. It has been an open problem to either improve the space upper bound or to show that this somewhat odd looking bound is tight.
In this paper, we show the latter: any data structure (deterministic or randomized) for monotone minimal perfect hashing of any collection of \(n\) elements from a universe of size \(u\) requires \(\Omega(n\cdot\log\log\log{u})\) expected bits to answer every query correctly.
We achieve our lower bound by defining a graph \(\mathbf{G}\) where the nodes are the possible \({u\choose n}\) inputs and where two nodes are adjacent if they cannot share the same \(\text{D}\). The size of \(\text{D}\) is then lower bounded by the log of the chromatic number of \(\mathbf{G}\). Finally, we show that the fractional chromatic number (and hence the chromatic number) of \(\mathbf{G}\) is lower bounded by \(2^{\Omega(n\log\log\log u)}\).

References

[1]
Djamal Belazzougui. 2014. Linear time construction of compressed text indices in compact space. In Proceedings of the forty-sixth Annual ACM Symposium on Theory of Computing. 148–193.
[2]
Djamal Belazzougui, Paolo Boldi, Rasmus Pagh, and Sebastiano Vigna. 2009. Monotone minimal perfect hashing: searching a sorted table with O(1) accesses. In Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). SIAM, 785–794.
[3]
Djamal Belazzougui, Paolo Boldi, Rasmus Pagh, and Sebastiano Vigna. 2011. Theory and practice of monotone minimal perfect hashing. Journal of Experimental Algorithmics (JEA) 16 (2011), 3–1.
[4]
Djamal Belazzougui, Fabio Cunial, Juha Kärkkäinen, and Veli Mäkinen. 2020. Linear-time string indexing and analysis in small space. ACM Transactions on Algorithms (TALG) 16, 2 (2020), 1–54.
[5]
Djamal Belazzougui, Travis Gagie, Veli Mäkinen, and Marco Previtali. 2016. Fully dynamic de Bruijn graphs. In International symposium on string processing and information retrieval. Springer, 145–152.
[6]
Djamal Belazzougui and Gonzalo Navarro. 2014. Alphabet-independent compressed text indexing. ACM Transactions on Algorithms (TALG) 10, 4 (2014), 1–19.
[7]
Djamal Belazzougui and Gonzalo Navarro. 2015. Optimal lower and upper bounds for representing sequences. ACM Transactions on Algorithms (TALG) 11, 4 (2015), 1–21.
[8]
Paolo Boldi. 2015. Minimal and Monotone Minimal Perfect Hash Functions. In Mathematical Foundations of Computer Science 2015 - 40th International Symposium, MFCS 2015, Milan, Italy, August 24-28, 2015, Proceedings, Part I (Lecture Notes in Computer Science, Vol. 9234), Giuseppe F. Italiano, Giovanni Pighizzini, and Donald Sannella (Eds.). Springer, 3–17.
[9]
Paolo Boldi and Sebastiano Vigna. 2008. Sux4J 1.0. (2008).
[10]
Alexandra Boldyreva, Nathan Chenette, and Adam O’Neill. 2011. Order-preserving encryption revisited: Improved security analysis and alternative solutions. In Annual Cryptology Conference. Springer, 578–595.
[11]
Raphaël Clifford, Allyx Fontaine, Ely Porat, Benjamin Sach, and Tatiana Starikovskaya. 2015. Dictionary matching in a stream. In Algorithms-ESA 2015. Springer, 361–372.
[12]
Martin Dietzfelbinger et al. 2018. 4.3 Space Complexity of Monotone Minimal Perfect Hashing. Dagstuhl Reports, Vol. 7, Issue 5 ISSN 2192-5283 (2018), 19.
[13]
Dwight Duffus, Hannon Lefmann, and Vojtěch Rödl. 1995. Shift graphs and lower bounds on Ramsey numbers rk (l; r). Discrete Mathematics 137, 1-3 (1995), 177–187.
[14]
Paul Erdős and András Hajnal. 1966. On chromatic number of graphs and set-systems. Acta Math. Acad. Sci. Hungar 17, 61-99 (1966), 1.
[15]
Michael L Fredman and János Komlós. 1984. On the size of separating systems and families of perfect hash functions. SIAM Journal on Algebraic Discrete Methods 5, 1 (1984), 61–68.
[16]
Zoltan Füredi, Péter Hajnal, Vojtech Rödl, and William T Trotter. 1992. Interval orders and shift graphs. (1992).
[17]
Travis Gagie, Gonzalo Navarro, and Nicola Prezza. 2020. Fully functional suffix trees and optimal text searching in BWT-runs bounded space. Journal of the ACM (JACM) 67, 1 (2020), 1–54.
[18]
Roberto Grossi, Alessio Orlandi, and Rajeev Raman. 2010. Optimal trade-offs for succinct string indexes. In International Colloquium on Automata, Languages, and Programming. Springer, 678–689.
[19]
Torben Hagerup and Torsten Tholey. 2001. Efficient Minimal Perfect Hashing in Nearly Minimal Space. In STACS 2001, 18th Annual Symposium on Theoretical Aspects of Computer Science, Dresden, Germany, February 15-17, 2001, Proceedings (Lecture Notes in Computer Science, Vol. 2010), Afonso Ferreira and Horst Reichel (Eds.). Springer, 317–326.
[20]
Dmitry Kosolobov. 2024. Simplified Tight Bounds for Monotone Minimal Perfect Hashing. arXiv preprint arXiv:2403.07760 (2024).
[21]
Hyeontaek Lim, Bin Fan, David G Andersen, and Michael Kaminsky. 2011. SILT: A memory-efficient, high-performance key-value store. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles. 1–13.
[22]
Kurt Mehlhorn. 1982. On the program size of perfect and universal hash functions. In 23rd Annual Symposium on Foundations of Computer Science (sfcs 1982). IEEE, 170–175.
[23]
Gonzalo Navarro. 2014. Spaces, trees, and colors: The algorithmic landscape of document retrieval on sequences. ACM Computing Surveys (CSUR) 46, 4 (2014), 1–47.
[24]
Ilan Newman. 1991. Private vs. Common Random Bits in Communication Complexity. Inf. Process. Lett. 39, 2 (1991), 67–71.
[25]
Edward R Scheinerman and Daniel H Ullman. 2011. Fractional graph theory: a rational approach to the theory of graphs. Courier Corporation.
[26]
Gábor Simonyi and Gábor Tardos. 2011. On directed local chromatic number, shift graphs, and Borsuk-like graphs. Journal of Graph Theory 66, 1 (2011), 65–82.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Algorithms
ACM Transactions on Algorithms Just Accepted
EISSN:1549-6333
Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Online AM: 08 August 2024
Accepted: 28 May 2024
Revised: 18 May 2024
Received: 19 July 2023

Check for updates

Author Tags

  1. hashing
  2. data structures
  3. lower bounds
  4. fractional chromatic number

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 104
    Total Downloads
  • Downloads (Last 12 months)104
  • Downloads (Last 6 weeks)34
Reflects downloads up to 24 Nov 2024

Other Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media