Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Tensor relational algebra for distributed machine learning system design

Published: 01 April 2021 Publication History

Abstract

We consider the question: what is the abstraction that should be implemented by the computational engine of a machine learning system? Current machine learning systems typically push whole tensors through a series of compute kernels such as matrix multiplications or activation functions, where each kernel runs on an AI accelerator (ASIC) such as a GPU. This implementation abstraction provides little built-in support for ML systems to scale past a single machine, or for handling large models with matrices or tensors that do not easily fit into the RAM of an ASIC. In this paper, we present an alternative implementation abstraction called the tensor relational algebra (TRA). The TRA is a set-based algebra based on the relational algebra. Expressions in the TRA operate over binary tensor relations, where keys are multi-dimensional arrays and values are tensors. The TRA is easily executed with high efficiency in a parallel or distributed environment, and amenable to automatic optimization. Our empirical study shows that the optimized TRA-based back-end can significantly outperform alternatives for running ML workflows in distributed clusters.

References

[1]
[n.d.]. Dask. https://dask.org/.
[2]
[n.d.]. PETSc. https://www.mcs.anl.gov/petsc/.
[3]
[n.d.]. PyTorch. https://pytorch.org/.
[4]
[n.d.]. TensorFlow. https://www.tensorflow.org/.
[5]
Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. Tensorflow: A system for large-scale machine learning. In 12th {USENIX} symposium on operating systems design and implementation ({OSDI} 16). 265--283.
[6]
Christopher Aberger, Andrew Lamb, Kunle Olukotun, and Christopher Ré. 2018. Levelheaded: A unified engine for business intelligence and linear algebra querying. In 2018 IEEE 34th International Conference on Data Engineering (ICDE). IEEE, 449--460.
[7]
Mahmoud Abo Khamis, Hung Q Ngo, XuanLong Nguyen, Dan Olteanu, and Maximilian Schleich. 2018. In-database learning with sparse tensors. In Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. 325--340.
[8]
Mahmoud Abo Khamis, Hung Q Ngo, and Atri Rudra. 2016. FAQ: questions asked frequently. In Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. 13--28.
[9]
Ramesh C Agarwal, Susanne M Balle, Fred G Gustavson, Mahesh Joshi, and Prasad Palkar. 1995. A three-dimensional approach to parallel matrix multiplication. IBM Journal of Research and Development 39, 5 (1995), 575--582.
[10]
Satish Balay, Shrirang Abhyankar, Mark Adams, Jed Brown, Peter Brune, Kris Buschelman, Lisandro Dalcin, Alp Dener, Victor Eijkhout, W Gropp, et al. 2019. PETSc users manual. (2019).
[11]
Pablo Barceló, Nelson Higuera, Jorge Pérez, and Bernardo Subercaseaux. 2019. Expressiveness of Matrix and Tensor Query Languages in terms of ML Operators. In Proceedings of the 3rd International Workshop on Data Management for End-to-End Machine Learning. ACM, 9.
[12]
Brandon Barker. 2015. Message passing interface (mpi). In Workshop: High Performance Computing on Stampede, Vol. 262.
[13]
Peter Baumann, Andreas Dehmel, Paula Furtado, Roland Ritsch, and Norbert Widmann. 1998. The multidimensional database system RasDaMan. In SIGMOD Record, Vol. 27. ACM, 575--577.
[14]
Paul G Brown. 2010. Overview of SciDB: large scale array storage, processing and analysis. In SIGMOD. 963--968.
[15]
Zhuhua Cai, Zografoula Vagena, Luis Perez, Subramanian Arumugam, Peter J Haas, and Christopher Jermaine. 2013. Simulation of database-valued Markov chains using SimSQL. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. 637--648.
[16]
Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, et al. 2018. {TVM}: An automated end-to-end optimizing compiler for deep learning. In 13th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 18). 578--594.
[17]
Jaeyoung Choi, Jack J Dongarra, Roldan Pozo, and David W Walker. 1992. ScaLA-PACK: A scalable linear algebra library for distributed memory concurrent computers. In The Fourth Symposium on the Frontiers of Massively Parallel Computation. IEEE Computer Society, 120--121.
[18]
Jason Jinquan Dai, Yiheng Wang, Xin Qiu, Ding Ding, Yao Zhang, Yanzhang Wang, Xianyan Jia, Cherry Li Zhang, Yan Wan, Zhichao Li, et al. 2019. Bigdl: A distributed deep learning framework for big data. In Proceedings of the ACM Symposium on Cloud Computing. 50--60.
[19]
Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: An efficient SMT solver. In International conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, 337--340.
[20]
Oksana Dolmatova, Nikolaus Augsten, and Michael H Böhlen. 2020. A Relational Matrix Algebra and its Implementation in a Column Store. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 2573--2587.
[21]
Amol Ghoting, Rajasekar Krishnamurthy, Edwin Pednault, Berthold Reinwald, Vikas Sindhwani, Shirish Tatikonda, Yuanyuan Tian, and Shivakumar Vaithyanathan. 2011. SystemML: Declarative machine learning on MapReduce. In ICDE. 231--242.
[22]
Rong Gu, Yun Tang, Chen Tian, Hucheng Zhou, Guanru Li, Xudong Zheng, and Yihua Huang. 2017. Improving execution concurrency of large-scale matrix multiplication on distributed data-parallel platforms. IEEE Transactions on Parallel and Distributed Systems 28, 9 (2017), 2539--2552.
[23]
Donghyoung Han, Yoon-Min Nam, Jihye Lee, Kyongseok Park, Hyunwoo Kim, and Min-Soo Kim. 2019. DistME: A Fast and Elastic Distributed Matrix Computation Engine using GPUs. In Proceedings of the 2019 International Conference on Management of Data. 759--774.
[24]
T Hunter. 2016. Tensorframes on google's tensorflow and apache spark. Bay Area Spark Meetup (2016).
[25]
Dylan Hutchison, Bill Howe, and Dan Suciu. 2017. LaraDB: A minimalist kernel for linear and relational algebra computation. In Proceedings of the 4th ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond. ACM, 2.
[26]
Dimitrije Jankov, Shangyu Luo, Binhang Yuan, Zhuhua Cai, Jia Zou, Chris Jermaine, and Zekai J Gao. 2019. Declarative recursive computation on an RDBMS: or, why you should use a database for distributed machine learning. Proceedings of the VLDB Endowment 12, 7 (2019), 822--835.
[27]
Matthias Jasny, Tobias Ziegler, Tim Kraska, Uwe Roehm, and Carsten Binnig. 2020. DB4ML-An In-Memory Database Kernel with Machine Learning Support. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 159--173.
[28]
Mahmoud Abo Khamis, Hung Q Ngo, XuanLong Nguyen, Dan Olteanu, and Maximilian Schleich. 2018. AC/DC: in-database learning thunderstruck. In Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning. 1--10.
[29]
Mijung Kim and K Selçuk Candan. 2014. Efficient static and dynamic in-database tensor decompositions on chunk-based array stores. In Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management. 969--978.
[30]
Mijung Kim and K Selçuk Candan. 2014. Tensordb: In-database tensor manipulation with tensor-relational query plans. In Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management. ACM, 2039--2041.
[31]
Fredrik Kjolstad, Shoaib Kamil, Stephen Chou, David Lugato, and Saman Amarasinghe. 2017. The tensor algebra compiler. Proceedings of the ACM on Programming Languages 1, OOPSLA (2017), 1--29.
[32]
Sören Laue, Matthias Mitterreiter, and Joachim Giesen. 2020. A simple and efficient tensor calculus. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 4527--4534.
[33]
Shen Li, Yanli Zhao, Rohan Varma, Omkar Salpekar, Pieter Noordhuis, Teng Li, Adam Paszke, Jeff Smith, Brian Vaughan, Pritam Damania, et al. [n.d.]. PyTorch Distributed: Experiences on Accelerating Data Parallel Training. Proceedings of the VLDB Endowment 13, 12 ([n. d.]).
[34]
Xupeng Li, Bin Cui, Yiru Chen, Wentao Wu, and Ce Zhang. 2017. Mlog: Towards declarative in-database machine learning. Proceedings of the VLDB Endowment 10, 12 (2017), 1933--1936.
[35]
Shangyu Luo, Zekai J Gao, Michael Gubanov, Luis L Perez, and Christopher Jermaine. 2018. Scalable linear algebra on a relational database system. IEEE Transactions on Knowledge and Data Engineering 31, 7 (2018), 1224--1238.
[36]
Julian McAuley, Rahul Pandey, and Jure Leskovec. 2015. Inferring networks of substitutable and complementary products. In Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining. 785--794.
[37]
Julian McAuley, Christopher Targett, Qinfeng Shi, and Anton Van Den Hengel. 2015. Image-based recommendations on styles and substitutes. In Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval. 43--52.
[38]
Philipp Moritz, Robert Nishihara, Ion Stoica, and Michael I Jordan. 2015. Sparknet: Training deep networks in spark. arXiv preprint arXiv:1511.06051 (2015).
[39]
Matthew Rocklin. 2015. Dask: Parallel computation with blocked algorithms and task scheduling. In Proceedings of the 14th python in science conference, Vol. 126. Citeseer.
[40]
Edgar Solomonik and James Demmel. 2011. Communication-optimal parallel 2.5 D matrix multiplication and LU factorization algorithms. In European Conference on Parallel Processing. Springer, 90--109.
[41]
D Team et al. 2016. Deeplearning4j: Open-source distributed deep learning for the jvm. Apache Software Foundation License 2 (2016), 2.
[42]
Robert A Van De Geijn and Jerrell Watts. 1997. SUMMA: Scalable universal matrix multiplication algorithm. Concurrency: Practice and Experience 9, 4 (1997), 255--274.
[43]
Nicolas Vasilache, Oleksandr Zinenko, Theodoros Theodoridis, Priya Goyal, Zachary DeVito, William S Moses, Sven Verdoolaege, Andrew Adams, and Albert Cohen. 2018. Tensor comprehensions: Framework-agnostic high-performance machine learning abstractions. arXiv preprint arXiv:1802.04730 (2018).
[44]
Pete Warden. 2018. Speech commands: A dataset for limited-vocabulary speech recognition. arXiv preprint arXiv:1804.03209 (2018).
[45]
Matei Zaharia, Mosharaf Chowdhury, Michael J Franklin, Scott Shenker, and Ion Stoica. 2010. Spark: cluster computing with working sets. In USENIX HotCloud. 1--10.

Cited By

View all
  • (2024)Stochastic gradient descent without full data shuffle: with applications to in-database machine learning and deep learning systemsThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-024-00845-033:5(1231-1255)Online publication date: 1-Sep-2024
  • (2023)DIVISIONProceedings of the 40th International Conference on Machine Learning10.5555/3618408.3619904(36036-36057)Online publication date: 23-Jul-2023
  • (2023)A Comparison of End-to-End Decision Forest Inference PipelinesProceedings of the 2023 ACM Symposium on Cloud Computing10.1145/3620678.3624656(200-215)Online publication date: 30-Oct-2023
  • Show More Cited By

Index Terms

  1. Tensor relational algebra for distributed machine learning system design
    Index terms have been assigned to the content through auto-classification.

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image Proceedings of the VLDB Endowment
    Proceedings of the VLDB Endowment  Volume 14, Issue 8
    April 2021
    200 pages
    ISSN:2150-8097
    Issue’s Table of Contents

    Publisher

    VLDB Endowment

    Publication History

    Published: 01 April 2021
    Published in PVLDB Volume 14, Issue 8

    Qualifiers

    • Research-article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)25
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 16 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Stochastic gradient descent without full data shuffle: with applications to in-database machine learning and deep learning systemsThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-024-00845-033:5(1231-1255)Online publication date: 1-Sep-2024
    • (2023)DIVISIONProceedings of the 40th International Conference on Machine Learning10.5555/3618408.3619904(36036-36057)Online publication date: 23-Jul-2023
    • (2023)A Comparison of End-to-End Decision Forest Inference PipelinesProceedings of the 2023 ACM Symposium on Cloud Computing10.1145/3620678.3624656(200-215)Online publication date: 30-Oct-2023
    • (2023)Indexed Streams: A Formal Intermediate Representation for Fused Contraction ProgramsProceedings of the ACM on Programming Languages10.1145/35912687:PLDI(1169-1193)Online publication date: 6-Jun-2023
    • (2023)Optimizing Tensor Programs on Flexible StorageProceedings of the ACM on Management of Data10.1145/35887171:1(1-27)Online publication date: 30-May-2023
    • (2023)Optimizing Tensor Computations: From Applications to Compilation and Runtime TechniquesCompanion of the 2023 International Conference on Management of Data10.1145/3555041.3589407(53-59)Online publication date: 4-Jun-2023
    • (2022)Query processing on tensor computation runtimesProceedings of the VLDB Endowment10.14778/3551793.355183315:11(2811-2825)Online publication date: 1-Jul-2022
    • (2022)Autoscheduling for sparse tensor algebra with an asymptotic cost modelProceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation10.1145/3519939.3523442(269-285)Online publication date: 9-Jun-2022
    • (2022)In-Database Machine Learning with CorgiPile: Stochastic Gradient Descent without Full Data ShuffleProceedings of the 2022 International Conference on Management of Data10.1145/3514221.3526150(1286-1300)Online publication date: 10-Jun-2022
    • (2022)Givens QR Decomposition over Relational DatabasesProceedings of the 2022 International Conference on Management of Data10.1145/3514221.3526144(1948-1961)Online publication date: 10-Jun-2022
    • Show More Cited By

    View Options

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media