Abstract
A key aspect of any hash map design is the problem of dynamically resizing it in order to deal with hash collisions. Compression in tree-based hash maps is the ability of reducing the depth of the internal hash levels that support the hash map. In this context, elasticity refers to the ability of automatically resizing the internal data structures that support the hash map operations in order to meet varying workloads, thus optimizing the overall memory consumption of the hash map. This work extends a previous lock-free hash trie map design to support elastic hashing, i.e., expand saturated hash levels and compress unused hash levels, such that, at each point in time, the number of levels in a path is adjusted, as closely as possible, to the set of keys that is stored in the data structure. To materialize our design, we introduce a new compress operation for hash levels, which requires redesigning the existing search, insert, remove and expand operations in order to maintain the lock-freedom property of the data structure. Experimental results show that elasticity effectively improves the search operation and, in doing so, our design becomes very competitive when compared to other state-of-the-art designs implemented in Java.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Notes
Note that with persistent memory, keys must remain pinned to their initial memory references, i.e., the copy of a key to a new memory reference is not allowed.
The designs and the benchmarks are available at https://github.com/miar/ffpe.
Since we are using 8 bucket entries per hash level, all chain nodes will be located in a hash level with depth 8.
We are not including memory usage results since we were not able to obtain meaningful results from JVM about the memory footprints of the several designs. We used the formula ‘Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()’ but the results obtained were not accurate, with no good reasons to have big differences across the different runs of the same design.
The approximated number of keys in the data structures at the end of each benchmark is approximated by the expression \(8^8 * (RS + RI)\), where RS and RI are the ratios of the keys searched and the keys inserted, respectively, described in each benchmark.
SMP/NUMA bottlenecks are analyzed in detail in Drepper’s work [11].
References
Areias M, Rocha R (2012) An efficient and scalable memory allocator for multithreaded tabled evaluation of logic programs. In: International conference on parallel and distributed systems. IEEE Computer Society, pp 636–643
Areias M, Rocha R (2018) On extending a fixed size, persistent and lock-free hash map design to store sorted keys. In: International symposium on parallel and distributed processing with applications. IEEE Computer Society, Melbourne, pp 415–422
Bagwell P (2001) Ideal hash trees. Es Grands Champs, 1195
Bayer R, Mccreight EM (1972) Organization and maintenance of large ordered indexes. Acta Inf 1:173–189
Brown T (2017) A template for implementing fast lock-free trees using HTM. In: Proceedings of the ACM symposium on principles of distributed computing, PODC 2017. ACM, pp 293–302
Brown T (2017) Techniques for constructing efficient lock-free data structures. Ph.D. thesis, University of Toronto
Brown T, Prokopec A, Alistarh D (2020) Non-blocking interpolation search trees with doubly-logarithmic running time. Association for Computing Machinery, New York, pp 276–291
Chen C, Choudhury V, Newton R (2017) Adaptive lock-free data structures in haskell: a general method for concurrent implementation swapping. SIGPLAN Not 52(10):197–211
Codd EF (1970) A relational model for large shared data banks. Commun ACM 13(6):377–387
Comer D (1979) Ubiquitous b-tree. ACM Comput Surv 11:121–137
Drepper U (2007) What every programmer should know about memory - version 1.0. Technical report, Red Hat, Inc
Fredkin E (1962) Trie memory. Commun ACM 3:490–499
Grossi R, Ottaviano G (2015) Fast compressed tries through path decompositions. J Exp Algorithmics 19:1
Harris TL (2001) A pragmatic implementation of non-blocking linked-lists. In: International conference on distributed computing, DISC ’01. Springer, pp 300–314
Herlihy M, Wing JM (1990) Linearizability: a correctness condition for concurrent objects. ACM Trans Program Lang Syst 12(3):463–492
Herlihy M, Lev Y, Luchangco V, Shavit N (2006) A provably correct scalable concurrent skip list. In: International conference on principles of distributed systems. Technical report, Bordeaux
Herlihy M, Shavit N (2011) On the nature of progress. In: Principles of distributed systems. LNCS, vol 7109. Springer, pp 313–328
Herlihy M, Wing JM (1987) Axioms for concurrent objects. In: ACM symposium on principles of programming languages. ACM, pp 13–26
Kim C, Chhugani J, Satish N, Sedlar E, Nguyen AD, Kaldewey T, Lee VW, Brandt SA, Dubey P (2011) Designing fast architecture-sensitive tree search on modern multicore/many-core processors. ACM Trans Database Syst 36(4):22:1-22:34
Knuth DE (1998) The art of computer programming, volume 3: sorting and searching, 2nd ed. Addison-Wesley Longman
Lehman TJ, Carey MJ (1986) Query processing in main memory database management systems. In: Proceedings of the 1986 ACM SIGMOD international conference on management of data, SIGMOD ’86. ACM, pp 239–250
Mauchly J (1946) Theory and techniques for design of electronic digital computers
Mehlhorn K, Tsakalidis A (1993) Dynamic interpolation search. J ACM 40(3):621–634
Mehta DP, Sahni S (2004) Handbook of data structures and applications. Chapman & Hall/CRC, Boca Raton
Michael MM (2002) High performance dynamic lock-free hash tables and list-based sets. In: ACM symposium on parallel algorithms and architectures. ACM, pp 73–82
Moreno P, Areias M, Rocha R (2020) A compression-based design for higher throughput in a lock-free hash map. In: Proceedings of the 26th international European conference on parallel and distributed computing (Euro-Par 2020), LNCS. Springer, Warsaw, pp 458–473
Peterson WW (1957) Addressing for random-access storage. IBM J Res Dev 1(2):130–146
Prokopec A (2018) Cache-tries: concurrent lock-free hash tries with constant-time operations. In: Proceedings of the 23rd ACM SIGPLAN symposium on principles and practice of parallel programming, PPoPP ’18. ACM, pp 137–151
Prokopec A (2018) Efficient lock-free removing and compaction for the cache-trie data structure. In: Euro-Par 2018: Parallel processing-24th International conference on parallel and distributed computing. Lecture notes in computer science, vol 11014. Springer, pp 575–589
Prokopec A, Bronson NG, Bagwell P, Odersky M (2012) Concurrent tries with efficient non-blocking snapshots. In: ACM symposium on principles and practice of parallel programming. ACM, pp 151–160
Sagiv Y (1986) Concurrent operations on B*-trees with overtaking. J Comput Syst Sci 33(2):275–296
Shalev O, Shavit N (2006) Split-ordered lists: lock-free extensible hash tables. J ACM 53(3):379–405
Skarlatos D, Kokolis A, Xu T, Torrellas J (2020) Elastic cuckoo page tables: rethinking virtual memory translation for parallelism. In: Proceedings of the twenty-fifth international conference on architectural support for programming languages and operating systems, ASPLOS ’20. ACM, pp 1093–1108
The Java Concurrency Package (JSR-166)
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This work is financed by National Funds through the Portuguese funding agency, FCT—Fundação para a Ciência e a Tecnologia, within project UIDB/50014/2020. Miguel Areias was funded by the FCT Grant SFRH/BPD/108018/2015.
Rights and permissions
About this article
Cite this article
Areias, M., Rocha, R. On the correctness of a lock-free compression-based elastic mechanism for a hash trie design. Computing 104, 2279–2305 (2022). https://doi.org/10.1007/s00607-022-01085-2
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00607-022-01085-2