Abstract
This paper investigates the storage compactness of temporal merges. A temporal merge is a joint representation of a set of snapshots of the same database at different points in time. An axiomatic requirement for temporal merges is that each individual snapshot must be derivable from it. We first show that the usual temporal extension of a database is a special kind of temporal merge called a direct merge. For direct merges, finding the most compact representation is easy for separate relations. However, if snapshots feature foreign key dependencies, the representation may contain more tuples than strictly needed. We therefore study inclusion dependencies in temporal databases and show how to use them while maintaining minimality in our representation. Next, we argue that direct merges imply redundancy when they contain non-contiguous, value-equivalent tuples. We therefore introduce a more general kind of temporal merge known as a coherent merge and study its properties in depth. Deriving a minimal coherent merge from a direct one is shown to be NP-hard. However, several greedy algorithms are demonstrated to provide decent results in quadratic time. Next, an incremental algorithm for construction of coherent merges is presented. We provide experimental results on real-life data sets that support the usefulness of the contributions of this paper.
Similar content being viewed by others
Notes
By common convention, the concatenation of two sets of attributes signifies the union of those sets.
Coalescing is the operation of merging value-equivalent tuples with overlapping or meeting validity intervals.
GRID database: http://grid.ac.
References
Jensen, C., Snodgrass, R.: Temporal data management. IEEE Trans. Knowl. Data Eng. 11(1), 36–44 (1999)
Kimball, R., Ross, M.: The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd edn. Wiley, New York (2013)
Castelltort, A., Laurent, A.: Representing history in graph-oriented NoSQL databases: a versioning system. In: 8th International Conference on Digital Information Management, ICDIM 2013, vol. 1, pp. 228–234 (2013). https://doi.org/10.1109/ICDIM.2013.6694022
Böhlen, M.: Temporal database system implementations. SIGMOD Rec. 24(4), 53–60 (1995)
Snodgrass, R., Ahn, I.: A taxonomy of time in databases. ACM SIGMOD Rec. 14(4), 236–246 (1985)
Jensen, C., Snodgrass, R.: Semantics of time-varying information. Inf. Syst. 21(4), 311–352 (1996)
Özsoyoglu, G., Snodgrass, R.: Temporal and real-time databases: a survey. IEEE Trans. Knowl. Data Eng. 7(4), 513–532 (1995)
Jensen, C., et al.: The Consensus Glossary of Temporal Database Concepts February 1998 Version, pp. 367–405. Springer, Berlin (1998)
Lorentzos, N., Johnson, R.: Extending relational algebra to manipulate temporal data. Inf. Syst. 13(3), 289–296 (1988)
Navathe, S., Ahmed, R.: A temporal relational model and a query language. Inf. Sci. 49, 147–175 (1989)
Ahn, I., Snodgrass, R.: Partitioned storage for temporal databases. Inf. Syst. 13(4), 369–391 (1988)
Lum, V., Dadam, P., Erbe, R., Guenauer, J., Pistor, P., Walch, G., Werner, H., Woodfill, J.: Design of an Integrated DBMS to Support Advanced Applications, pp. 31–49. Springer, Berlin (1987)
Gunadhi, H., Segev, A.: A framework for query optimization in temporal databases. In: Proceedings of the 5th International Conference on Statistical and Scientific Database Management, pp. 131–147 (1988)
Gao, D., Jensen, C., Snodgrass, R., Soo, M.: Join operations in temporal databases. VLDB J. 14, 2–29 (2005). https://doi.org/10.1007/s00778-003-0111-3
Salzberg, B., Tsotras, V.: Comparison of access methods for time-evolving data. ACM Comput. Surv. 31(2), 185–221 (1999)
Dyreson, C.E.: Temporal coalescing with now, granularity, and incomplete information. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 169–180 (2003). https://doi.org/10.1145/872757.872779
Bebel, B., Krolikowski, Z., Wrembel, R.: Formal approach to modelling a multiversion data warehouse. Bull. Pol. Acad. Sci. Tech. Sci. 54(1), 51–62 (2006)
Amagasa, T., Yoshikawa, M., Uemura, S.: A data model for temporal xml documents. In: Ibrahim, M., Küng, J., Revell, N. (eds.) Proceedings of the 11th International Conference, DEXA 2000, pp. 334–344. Springer, Berlin (2000). https://doi.org/10.1007/3-540-44469-6_31
Amagasa, T., Yoshikawa, M., Uemura, S.: Realizing temporal xml repositories using temporal relational databases. In: Proceedings of the 3rd International Symposium on Cooperative Database Systems for Advanced Applications. IEEE (2001). https://doi.org/10.1109/CODAS.2001.945150
Grandi, F., Mandreoli, F.: The valid web: an XML/XSL infrastructure for temporal management of web documents. In: Yakhno, T. (ed.) Lecture Notes in Computer Science 1909—Proceedings of the 1st International Conference on Advances in Information Systems, pp. 294–303. Springer, Berlin (2000)
Grandi, F., Mandreoli, F., Tiberio, P.: Temporal modelling and management of normative documents in XML format. Data Knowl. Eng. 54, 327–354 (2005)
Grandi, F., Mandreoli, F., Tiberio, P., Bergonzini, M.: A temporal data model and management system for normative texts in xml format. In: Proceedings of the 5th ACM International Workshop on Web Information and Data Management, pp. 29–36. ACM, New York (2003)
Chien, S.Y., Tsotras, V.J., Zaniolo, C., Zhang, D.: Efficient complex query support for multiversion xml documents. In: Proceedings of the 8th International Conference on Extending Database Technology, pp. 161–178. Springer, Berlin (2002). https://doi.org/10.1007/3-540-45876-X_12
Wang, F.: XML-based support for database histories and document versions. Ph.D. Thesis, University of California (2004)
Wang, F., Zaniolo, C.: Temporal queries in xml document archives and web warehouses. In: Proceedings of the 10th International Symposium on Temporal Representation and Reasoning and the Fourth International Conference on Temporal Logic. IEEE (2003). https://doi.org/10.1109/TIME.2003.1214879
Ali, K.A., Pokorny, J.: A Comparison of XML-Based Temporal Models, pp. 339–350. Springer, Berlin (2009)
Gergatsoulis, M., Stavrakas, Y.: Representing Changes in XML Documents Using Dimensions, pp. 208–222. IEEE, New York (2003). https://doi.org/10.1007/978-3-540-39429-7_14
Bellini, P., Bruno, I., Nesi, P., Rauch, N.: Graph databases methodology and tool supporting index/store versioning. J. Vis. Lang. Comput. 31, 222–229 (2015). https://doi.org/10.1016/j.jvlc.2015.10.018
Gutierrez, C., Hurtado, C., Vaisman, A.: Introducing time into RDF. IEEE Trans. Knowl. Data Eng. 19(2), 207–218 (2007)
Graube, M., Hensel, S., Urbas, L.: R43ples: revisions for triples—an approach for version control in the semantic web. In: LDQ@SEMANTICS, pp. 1–8. CEUR-WS.org (2014)
Codd, E.F.: A relational model of data for large shared data banks. Commun. ACM 13(6), 377–387 (1970)
Wijsen, J.: Temporal FDs on complex objects. ACM Trans. Database Syst. 24(1), 127–176 (1999)
Wijsen, J.: Temporal Dependencies, pp. 2960–2966. Springer, Berlin (2009)
Casanova, M., Fagin, R., Papadimitriou, C.: Inclusion dependencies and their interaction with functional dependencies. J. Comput. Syst. Sci. 28, 29–59 (1984)
Rescher, N., Manor, R.: On inference from inconsistent premises. Theory Decis. 1, 179–217 (1970)
Destercke, S., Dubois, D., Chojnacki, E.: Possibilistic information fusion using maximal coherent subsets. IEEE Trans. Fuzzy Syst. 17(1), 79–92 (2009). https://doi.org/10.1109/TFUZZ.2008.2005731
Robson, J.: Algorithms for maximum independent sets. J. Algorithms 7(3), 425–440 (1986)
Sakai, S., Togasaki, M., Yamazaki, K.: A note on greedy algorithms for the maximum weighted independent set problem. Discrete Appl. Math. 126(2), 313–322 (2003)
Chen, P.P.S.: The entity-relationship model-toward a unified view of data. ACM Trans. Database Syst. 1(1), 9–36 (1976). https://doi.org/10.1145/320434.320440
Ester, M., Kriegel, H.P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD-96), pp. 226–231. AAAI Press, New York (1996)
Allen, J.F.: Maintaining knowledge about temporal intervals. Commun. ACM 26(11), 832–843 (1983)
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Bronselaer, A., Billiet, C., De Mol, R. et al. Compact representations of temporal databases. The VLDB Journal 28, 473–496 (2019). https://doi.org/10.1007/s00778-018-0535-4
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00778-018-0535-4