Abstract
With the aim of obtaining time/space improvements in classic Data Structures, an emerging trend is to combine Machine Learning techniques with those native to Data Structures. This new area goes under the name of Learned Data Structures. The motivation for its study is a perceived change of paradigm in Computer Architectures that would favour the use of Graphics Processing Units and Tensor Processing Units over conventional Central Processing Units. In turn, this shift would favour the use of Neural Networks as building blocks of classic Data Structures. Indeed, Learned Bloom Filters, which are one of the main pillars of Learned Data Structures, make extensive use of Neural Networks to improve the performance of classic Filters. However, no use of Neural Networks has been reported in the realm of Learned Indexes, the other main pillar of this new area. In this contribution, we provide the first, and much needed, comparative experimental analysis of the use of Neural Networks as building blocks of Learned Indexes. The results reported here highlight the need for very specialized Neural Networks tailored to Learned Indexes, and they establish solid ground for those developments. Our findings, which are methodologically important, are of interest to both scientists and engineers working in Neural Network design and implementation, also in view of the importance of the application areas involved, e.g., Computer Networks and Databases.
This research is funded in part by the MIUR Project of National Relevance 2017WR7SHH “Multicriteria Data Structures and Algorithms: from compressed to learned indexes, and beyond”. We also acknowledge an NVIDIA Higher Education and Research Grant (donation of a Titan V GPU).
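To make the object of study concrete: an atomic Learned Index replaces, or augments, binary search over a sorted table with a model that predicts the position of a query key, followed by a local search restricted to the model's worst-case prediction error. The Python sketch below is illustrative only and is not the implementation evaluated in this paper; it uses a simple linear regression where the paper studies Neural Networks, and the class name AtomicLearnedIndex is ours, introduced purely for exposition.

import bisect

import numpy as np

class AtomicLearnedIndex:
    """Minimal sketch of an atomic learned index: a model maps a key to a
    predicted position in a sorted table, and a bounded binary search
    corrects the prediction. A linear regression stands in for the
    neural models studied in the paper."""

    def __init__(self, keys):
        self.keys = np.asarray(keys)          # sorted array of keys
        positions = np.arange(len(keys))      # rank of each key
        # Fit position ~ a * key + b by least squares (the simplest model).
        self.a, self.b = np.polyfit(self.keys, positions, deg=1)
        # The maximum prediction error over the table defines the search window.
        preds = np.rint(self.a * self.keys + self.b).astype(int)
        self.eps = int(np.max(np.abs(preds - positions)))

    def search(self, key):
        """Return True if key is in the table."""
        n = len(self.keys)
        pred = int(round(self.a * key + self.b))
        lo = max(0, pred - self.eps)
        hi = min(n, pred + self.eps + 1)
        # Binary search restricted to the model's error window.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i < n and self.keys[i] == key

# Usage: build on a sorted table and query it.
table = sorted(np.random.default_rng(0).integers(0, 10**6, size=10**4))
idx = AtomicLearnedIndex(table)
assert idx.search(table[42]) and not idx.search(-1)

The design question the paper asks is what happens when the linear model above is replaced by a Neural Network: a better fit shrinks the error window eps, and hence the final search, but prediction itself becomes more expensive to compute.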
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Amato, D., Lo Bosco, G., Giancarlo, R. (2022). On the Suitability of Neural Networks as Building Blocks for the Design of Efficient Learned Indexes. In: Iliadis, L., Jayne, C., Tefas, A., Pimenidis, E. (eds.) Engineering Applications of Neural Networks. EANN 2022. Communications in Computer and Information Science, vol 1600. Springer, Cham. https://doi.org/10.1007/978-3-031-08223-8_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08222-1
Online ISBN: 978-3-031-08223-8