Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3183713.3183758acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Incremental View Maintenance with Triple Lock Factorization Benefits

Published: 27 May 2018 Publication History

Abstract

We introduce F-IVM, a unified incremental view maintenance (IVM) approach for a variety of tasks, including gradient computation for learning linear regression models over joins, matrix chain multiplication, and factorized evaluation of conjunctive queries.
F-IVM is a higher-order IVM algorithm that reduces the maintenance of the given task to the maintenance of a hierarchy of increasingly simpler views. The views are functions mapping keys, which are tuples of input data values, to payloads, which are elements from a task-specific ring. Whereas the computation over the keys is the same for all tasks, the computation over the payloads depends on the task. F-IVM achieves efficiency by factorizing the computation of the keys, payloads, and updates.
We implemented F-IVM as an extension of DBToaster. We show in a range of scenarios that it can outperform classical first-order IVM, DBToaster's fully recursive higher-order IVM, and plain recomputation by orders of magnitude while using less memory.

References

[1]
Create Indexed Views. http://msdn.microsoft.com/en-us/library/ms191432.aspx.
[2]
Higgs Twitter Dataset. https://snap.stanford.edu/data/higgs-twitter.html.
[3]
Materialized View Concepts and Architecture. http://docs.oracle.com/cd/B28359_01/server.111/b28326/repmview.htm.
[4]
D. J. Abadi, Y. Ahmad, M. Balazinska, et al. The Design of the Borealis Stream Processing Engine. In CIDR, volume 5, pages 277--289, 2005.
[5]
M. Abo Khamis, H. Q. Ngo, and A. Rudra. FAQ: Questions Asked Frequently. In PODS, pages 13--28, 2016.
[6]
M. Aref, B. ten Cate, T. J. Green, B. Kimelfeld, et al. Design and Implementation of the LogicBlox System. In SIGMOD, pages 1371--1382, 2015.
[7]
N. Bakibayev, T. Kociský, D. Olteanu, and J. Závodnỳ. Aggregation and Ordering in Factorised Databases. PVLDB, 6(14):1990--2001, 2013.
[8]
C. Berkholz, J. Keppeler, and N. Schweikardt. Answering Conjunctive Queries under Updates. In PODS, pages 303--318, 2017.
[9]
P. G. Brown. Overview of SciDB: Large Scale Array Storage, Processing and Analysis. In SIGMOD, pages 963--968, 2010.
[10]
B. Chandramouli, J. Goldstein, et al. Trill: A high-performance Incremental Query Processor for Diverse Analytics. PVLDB, 8(4):401--412, 2014.
[11]
L. Chen, A. Kumar, J. F. Naughton, and J. M. Patel. Towards linear algebra over normalized data. CoRR, abs/1612.07448, 2016. To appear in PVLDB 2017.
[12]
R. Chirkova and J. Yang. Materialized Views. Found. &Trends in DB, 4(4):295--405, 2012.
[13]
T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. The MIT Press, 2009.
[14]
A. Elgohary, M. Boehm, P. J. Haas, F. R. Reiss, and B. Reinwald. Compressed linear algebra for large-scale machine learning. PVLDB, 9(12):960--971, 2016.
[15]
R. Fagin, A. O. Mendelzon, and J. D. Ullman. A simplied universal relation assumption and its properties. ACM Trans. Database Syst., 7(3):343--360, 1982.
[16]
X. Feng, A. Kumar, B. Recht, and C. Ré. Towards a Unified Architecture for In-RDBMS Analytics. In SIGMOD, pages 325--336, 2012.
[17]
T. J. Green, G. Karvounarakis, and V. Tannen. Provenance Semirings. In PODS, pages 31--40, 2007.
[18]
T. J. Green, D. Olteanu, and G. Washburn. Live programming in the LogicBlox system: A MetaLogiQL approach. PVLDB, 8(12):1782--1791, 2015.
[19]
P. Gupta, N. Koudas, E. Shang, R. Johnson, and C. Zuzarte. Processing Analytical Workloads Incrementally. CoRR, abs/1509.05066, 2015.
[20]
J. M. Hellerstein, C. Ré, F. Schoppmann, D. Z. Wang, E. Fratkin, et al. The MADlib Analytics Library or MAD Skills, the SQL. PVLDB, 5(12):1700--1711, 2012.
[21]
M. Idris, M. Ugarte, and S. Vansummeren. The Dynamic Yannakakis Algorithm: Compact and Efficient Query Processing Under Updates. In SIGMOD, pages 1259--1274, 2017.
[22]
Y. Katsis, K. W. Ong, Y. Papakonstantinou, and K. K. Zhao. Utilizing IDs to Accelerate Incremental View Maintenance. In SIGMOD, pages 1985--2000, 2015.
[23]
M. A. Khamis, H. Ngo, X. Nguyen, D. Olteanu, and M. Schleich. In-Database Learning with Sparse Tensors. In PODS, 2018.
[24]
C. Koch. Incremental Query Evaluation in a Ring of Databases. In PODS, pages 87--98, 2010.
[25]
C. Koch, Y. Ahmad, O. Kennedy, M. Nikolic, A. Nötzli, D. Lupei, et al. DBToaster: Higher-order Delta Processing for Dynamic, Frequently Fresh Views. VLDB J., 23(2):253--278, 2014.
[26]
T. G. Kolda and B. W. Bader. Tensor decompositions and applications. SIAM Rev., 51(3):455--500, 2009.
[27]
A. Kumar, M. Boehm, and J. Yang. Data management in machine learning: Challenges, techniques, and systems. In SIGMOD, pages 1717--1722, 2017.
[28]
A. Kumar, J. F. Naughton, and J. M. Patel. Learning Generalized Linear Models Over Normalized Data. In SIGMOD, pages 1969--1984, 2015.
[29]
S. R. Madden et al. TinyDB: An Acquisitional Query Processing System for Sensor Networks. TODS, 30(1):122--173, 2005.
[30]
X. Meng et al. MLlib: Machine Learning in Apache Spark. J. Mach. Learn. Res., 17(1):1235--1241, 2016.
[31]
H. Q. Ngo, C. Ré, and A. Rudra. Skew Strikes Back: New Developments in the Theory of Join Algorithms. SIGMOD Record, 42(4):5--16, 2013.
[32]
M. Nikolic, M. Dashti, and C. Koch. How to win a hot dog eating contest: Distributed incremental view maintenance with batch updates. In SIGMOD, pages 511--526, 2016.
[33]
M. Nikolic, M. Elseidy, and C. Koch. LINVIEW: Incremental View Maintenance for Complex Analytical Queries. In SIGMOD, pages 253--264, 2014.
[34]
M. Nikolic and D. Olteanu. Incremental view maintenance with triple lock factorization benefits. CoRR, abs/1703.07484, 2017.
[35]
D. Olteanu, C. Koch, and L. Antova. World-set decompositions: Expressiveness and efficient algorithms. Theor. Comput. Sci., 403(2--3):265--284, 2008.
[36]
D. Olteanu and M. Schleich. F: Regression Models over Factorized Views. PVLDB, 9(13):1573--1576, 2016.
[37]
D. Olteanu and M. Schleich. Factorized Databases. SIGMOD Rec., 45(2):5--16, 2016.
[38]
D. Olteanu and J. Závodnỳ. Size Bounds for Factorised Representations of Query Results. TODS, 40(1):2:1--2:44, 2015.
[39]
N. Polyzotis, S. Roy, S. E. Whang, and M. Zinkevich. Data management challenges in production machine learning. In SIGMOD, pages 1723--1726, 2017.
[40]
C. Qin and F. Rusu. Speculative Approximations for Terascale Distributed Gradient Descent Optimization. In DanaC, pages 1--10, 2015.
[41]
S. Rendle. Scaling Factorization Machines to Relational Data. PVLDB, 6(5):337--348, 2013.
[42]
M. Schleich, D. Olteanu, and R. Ciucanu. Learning Linear Regression Models over Factorized Joins. In SIGMOD, pages 3--18, 2016.
[43]
S. Shalev-Shwartz et al. Online Learning and Online Convex Optimization. Found. &Trends in ML, 4(2):107--194, 2012.
[44]
N. D. Sidiropoulos, L. De Lathauwer, X. Fu, K. Huang, E. E. Papalexakis, and C. Faloutsos. Tensor decomposition for signal processing and machine learning. Trans. Sig. Proc., 65(13):3551--3582, 2017.
[45]
C. Whaley and J. Dongarra. Automatically tuned linear algebra software. In PPSC, 1999.
[46]
M. Zaharia et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. In NSDI, pages 15--28, 2012.
[47]
W. Zhao, F. Rusu, B. Dong, K. Wu, and P. Nugent. Incremental View Maintenance over Array Data. In SIGMOD, pages 139--154, 2017.

Cited By

View all
  • (2024)Streaming enumeration on nested documentsACM Transactions on Database Systems10.1145/3701557Online publication date: 25-Oct-2024
  • (2024)Insert-Only versus Insert-Delete in Dynamic Query EvaluationProceedings of the ACM on Management of Data10.1145/36958372:5(1-26)Online publication date: 7-Nov-2024
  • (2024)In-depth Analysis of Continuous Subgraph Matching in a Common Delta Query Compilation FrameworkProceedings of the ACM on Management of Data10.1145/36549502:3(1-27)Online publication date: 30-May-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data
May 2018
1874 pages
ISBN:9781450347037
DOI:10.1145/3183713
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 May 2018

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. factorized representation
  2. incremental view maintenance
  3. materialized views
  4. query optimization
  5. rings
  6. stream processing

Qualifiers

  • Research-article

Funding Sources

  • ERC

Conference

SIGMOD/PODS '18
Sponsor:

Acceptance Rates

SIGMOD '18 Paper Acceptance Rate 90 of 461 submissions, 20%;
Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)54
  • Downloads (Last 6 weeks)6
Reflects downloads up to 12 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Streaming enumeration on nested documentsACM Transactions on Database Systems10.1145/3701557Online publication date: 25-Oct-2024
  • (2024)Insert-Only versus Insert-Delete in Dynamic Query EvaluationProceedings of the ACM on Management of Data10.1145/36958372:5(1-26)Online publication date: 7-Nov-2024
  • (2024)In-depth Analysis of Continuous Subgraph Matching in a Common Delta Query Compilation FrameworkProceedings of the ACM on Management of Data10.1145/36549502:3(1-27)Online publication date: 30-May-2024
  • (2024)In-Database Data ImputationProceedings of the ACM on Management of Data10.1145/36393262:1(1-27)Online publication date: 26-Mar-2024
  • (2024)Recent Increments in Incremental View MaintenanceCompanion of the 43rd Symposium on Principles of Database Systems10.1145/3635138.3654763(8-17)Online publication date: 9-Jun-2024
  • (2024)Hybrid Evaluation for Occlusion-based Explanations on CNN Inference Queries2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00078(953-966)Online publication date: 13-May-2024
  • (2023)JoinBoost: Grow Trees over Normalized Data Using Only SQLProceedings of the VLDB Endowment10.14778/3611479.361150916:11(3071-3084)Online publication date: 24-Aug-2023
  • (2023)DBSP: Automatic Incremental View Maintenance for Rich Query LanguagesProceedings of the VLDB Endowment10.14778/3587136.358713716:7(1601-1614)Online publication date: 8-May-2023
  • (2023)Change Propagation Without JoinsProceedings of the VLDB Endowment10.14778/3579075.357908016:5(1046-1058)Online publication date: 1-Jan-2023
  • (2023)Lightweight Materialization for Fast Dashboards Over JoinsProceedings of the ACM on Management of Data10.1145/36267351:4(1-27)Online publication date: 12-Dec-2023
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media