Nothing Special   »   [go: up one dir, main page]

Skip to main content
Log in

Compact representations of temporal databases

  • Regular Paper
  • Published:
The VLDB Journal Aims and scope Submit manuscript

Abstract

This paper investigates the storage compactness of temporal merges. A temporal merge is a joint representation of a set of snapshots of the same database at different points in time. An axiomatic requirement for temporal merges is that each individual snapshot must be derivable from it. We first show that the usual temporal extension of a database is a special kind of temporal merge called a direct merge. For direct merges, finding the most compact representation is easy for separate relations. However, if snapshots feature foreign key dependencies, the representation may contain more tuples than strictly needed. We therefore study inclusion dependencies in temporal databases and show how to use them while maintaining minimality in our representation. Next, we argue that direct merges imply redundancy when they contain non-contiguous, value-equivalent tuples. We therefore introduce a more general kind of temporal merge known as a coherent merge and study its properties in depth. Deriving a minimal coherent merge from a direct one is shown to be NP-hard. However, several greedy algorithms are demonstrated to provide decent results in quadratic time. Next, an incremental algorithm for construction of coherent merges is presented. We provide experimental results on real-life data sets that support the usefulness of the contributions of this paper.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18
Fig. 19
Fig. 20
Fig. 21
Fig. 22
Fig. 23
Fig. 24

Similar content being viewed by others

Notes

  1. By common convention, the concatenation of two sets of attributes signifies the union of those sets.

  2. Coalescing is the operation of merging value-equivalent tuples with overlapping or meeting validity intervals.

  3. GRID database: http://grid.ac.

  4. https://takeout.google.com.

References

  1. Jensen, C., Snodgrass, R.: Temporal data management. IEEE Trans. Knowl. Data Eng. 11(1), 36–44 (1999)

    Article  Google Scholar 

  2. Kimball, R., Ross, M.: The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd edn. Wiley, New York (2013)

    Google Scholar 

  3. Castelltort, A., Laurent, A.: Representing history in graph-oriented NoSQL databases: a versioning system. In: 8th International Conference on Digital Information Management, ICDIM 2013, vol. 1, pp. 228–234 (2013). https://doi.org/10.1109/ICDIM.2013.6694022

  4. Böhlen, M.: Temporal database system implementations. SIGMOD Rec. 24(4), 53–60 (1995)

    Article  Google Scholar 

  5. Snodgrass, R., Ahn, I.: A taxonomy of time in databases. ACM SIGMOD Rec. 14(4), 236–246 (1985)

    Article  Google Scholar 

  6. Jensen, C., Snodgrass, R.: Semantics of time-varying information. Inf. Syst. 21(4), 311–352 (1996)

    Article  Google Scholar 

  7. Özsoyoglu, G., Snodgrass, R.: Temporal and real-time databases: a survey. IEEE Trans. Knowl. Data Eng. 7(4), 513–532 (1995)

    Article  Google Scholar 

  8. Jensen, C., et al.: The Consensus Glossary of Temporal Database Concepts February 1998 Version, pp. 367–405. Springer, Berlin (1998)

    Google Scholar 

  9. Lorentzos, N., Johnson, R.: Extending relational algebra to manipulate temporal data. Inf. Syst. 13(3), 289–296 (1988)

    Article  MATH  Google Scholar 

  10. Navathe, S., Ahmed, R.: A temporal relational model and a query language. Inf. Sci. 49, 147–175 (1989)

    Article  MATH  Google Scholar 

  11. Ahn, I., Snodgrass, R.: Partitioned storage for temporal databases. Inf. Syst. 13(4), 369–391 (1988)

    Article  MATH  Google Scholar 

  12. Lum, V., Dadam, P., Erbe, R., Guenauer, J., Pistor, P., Walch, G., Werner, H., Woodfill, J.: Design of an Integrated DBMS to Support Advanced Applications, pp. 31–49. Springer, Berlin (1987)

    Google Scholar 

  13. Gunadhi, H., Segev, A.: A framework for query optimization in temporal databases. In: Proceedings of the 5th International Conference on Statistical and Scientific Database Management, pp. 131–147 (1988)

  14. Gao, D., Jensen, C., Snodgrass, R., Soo, M.: Join operations in temporal databases. VLDB J. 14, 2–29 (2005). https://doi.org/10.1007/s00778-003-0111-3

    Article  Google Scholar 

  15. Salzberg, B., Tsotras, V.: Comparison of access methods for time-evolving data. ACM Comput. Surv. 31(2), 185–221 (1999)

    Article  Google Scholar 

  16. Dyreson, C.E.: Temporal coalescing with now, granularity, and incomplete information. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 169–180 (2003). https://doi.org/10.1145/872757.872779

  17. Bebel, B., Krolikowski, Z., Wrembel, R.: Formal approach to modelling a multiversion data warehouse. Bull. Pol. Acad. Sci. Tech. Sci. 54(1), 51–62 (2006)

    MATH  Google Scholar 

  18. Amagasa, T., Yoshikawa, M., Uemura, S.: A data model for temporal xml documents. In: Ibrahim, M., Küng, J., Revell, N. (eds.) Proceedings of the 11th International Conference, DEXA 2000, pp. 334–344. Springer, Berlin (2000). https://doi.org/10.1007/3-540-44469-6_31

    Google Scholar 

  19. Amagasa, T., Yoshikawa, M., Uemura, S.: Realizing temporal xml repositories using temporal relational databases. In: Proceedings of the 3rd International Symposium on Cooperative Database Systems for Advanced Applications. IEEE (2001). https://doi.org/10.1109/CODAS.2001.945150

  20. Grandi, F., Mandreoli, F.: The valid web: an XML/XSL infrastructure for temporal management of web documents. In: Yakhno, T. (ed.) Lecture Notes in Computer Science 1909—Proceedings of the 1st International Conference on Advances in Information Systems, pp. 294–303. Springer, Berlin (2000)

  21. Grandi, F., Mandreoli, F., Tiberio, P.: Temporal modelling and management of normative documents in XML format. Data Knowl. Eng. 54, 327–354 (2005)

    Article  Google Scholar 

  22. Grandi, F., Mandreoli, F., Tiberio, P., Bergonzini, M.: A temporal data model and management system for normative texts in xml format. In: Proceedings of the 5th ACM International Workshop on Web Information and Data Management, pp. 29–36. ACM, New York (2003)

  23. Chien, S.Y., Tsotras, V.J., Zaniolo, C., Zhang, D.: Efficient complex query support for multiversion xml documents. In: Proceedings of the 8th International Conference on Extending Database Technology, pp. 161–178. Springer, Berlin (2002). https://doi.org/10.1007/3-540-45876-X_12

  24. Wang, F.: XML-based support for database histories and document versions. Ph.D. Thesis, University of California (2004)

  25. Wang, F., Zaniolo, C.: Temporal queries in xml document archives and web warehouses. In: Proceedings of the 10th International Symposium on Temporal Representation and Reasoning and the Fourth International Conference on Temporal Logic. IEEE (2003). https://doi.org/10.1109/TIME.2003.1214879

  26. Ali, K.A., Pokorny, J.: A Comparison of XML-Based Temporal Models, pp. 339–350. Springer, Berlin (2009)

    Google Scholar 

  27. Gergatsoulis, M., Stavrakas, Y.: Representing Changes in XML Documents Using Dimensions, pp. 208–222. IEEE, New York (2003). https://doi.org/10.1007/978-3-540-39429-7_14

    Google Scholar 

  28. Bellini, P., Bruno, I., Nesi, P., Rauch, N.: Graph databases methodology and tool supporting index/store versioning. J. Vis. Lang. Comput. 31, 222–229 (2015). https://doi.org/10.1016/j.jvlc.2015.10.018

    Article  Google Scholar 

  29. Gutierrez, C., Hurtado, C., Vaisman, A.: Introducing time into RDF. IEEE Trans. Knowl. Data Eng. 19(2), 207–218 (2007)

    Article  Google Scholar 

  30. Graube, M., Hensel, S., Urbas, L.: R43ples: revisions for triples—an approach for version control in the semantic web. In: LDQ@SEMANTICS, pp. 1–8. CEUR-WS.org (2014)

  31. Codd, E.F.: A relational model of data for large shared data banks. Commun. ACM 13(6), 377–387 (1970)

    Article  MATH  Google Scholar 

  32. Wijsen, J.: Temporal FDs on complex objects. ACM Trans. Database Syst. 24(1), 127–176 (1999)

    Article  Google Scholar 

  33. Wijsen, J.: Temporal Dependencies, pp. 2960–2966. Springer, Berlin (2009)

    Google Scholar 

  34. Casanova, M., Fagin, R., Papadimitriou, C.: Inclusion dependencies and their interaction with functional dependencies. J. Comput. Syst. Sci. 28, 29–59 (1984)

    Article  MathSciNet  MATH  Google Scholar 

  35. Rescher, N., Manor, R.: On inference from inconsistent premises. Theory Decis. 1, 179–217 (1970)

    Article  MATH  Google Scholar 

  36. Destercke, S., Dubois, D., Chojnacki, E.: Possibilistic information fusion using maximal coherent subsets. IEEE Trans. Fuzzy Syst. 17(1), 79–92 (2009). https://doi.org/10.1109/TFUZZ.2008.2005731

    Article  Google Scholar 

  37. Robson, J.: Algorithms for maximum independent sets. J. Algorithms 7(3), 425–440 (1986)

    Article  MathSciNet  MATH  Google Scholar 

  38. Sakai, S., Togasaki, M., Yamazaki, K.: A note on greedy algorithms for the maximum weighted independent set problem. Discrete Appl. Math. 126(2), 313–322 (2003)

    Article  MathSciNet  MATH  Google Scholar 

  39. Chen, P.P.S.: The entity-relationship model-toward a unified view of data. ACM Trans. Database Syst. 1(1), 9–36 (1976). https://doi.org/10.1145/320434.320440

    Article  MathSciNet  Google Scholar 

  40. Ester, M., Kriegel, H.P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD-96), pp. 226–231. AAAI Press, New York (1996)

  41. Allen, J.F.: Maintaining knowledge about temporal intervals. Commun. ACM 26(11), 832–843 (1983)

    Article  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Antoon Bronselaer.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Bronselaer, A., Billiet, C., De Mol, R. et al. Compact representations of temporal databases. The VLDB Journal 28, 473–496 (2019). https://doi.org/10.1007/s00778-018-0535-4

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00778-018-0535-4

Keywords

Navigation