On warehousing historical web information

Published: 09 October 2000

Abstract

We present a temporal web data model designed for warehousing historical data from the World Wide Web that changes with time. As the Web is now populated with a large volume of information, it has become necessary to capture useful web information in a data warehouse that supports further intelligent data analysis. However, because of the unstructured and dynamic nature of the Web, the traditional relational model and its temporal variants cannot be used to build such a data warehouse. In this paper, we therefore propose a temporal web data model that captures the connectivities of web documents and their content in the form of temporal web tables. To support the analysis of web data that evolves with time, valid time intervals are associated with each web document. To manipulate temporal web tables, we define a variety of web operators and illustrate their usefulness with realistic motivating examples.
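
As an illustration only, the idea of attaching a valid time interval to each web document and grouping linked documents into a table can be sketched as follows. The abstract does not give the paper's formal definitions, so everything here is an assumption made for the sketch: the names `WebDocument`, `WebTuple`, and `time_slice`, the half-open interval convention, and the example URLs. It is not the authors' model or operator set.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class WebDocument:
    """A document version with a valid time interval (hypothetical structure)."""
    url: str
    content: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the version is still valid

    def valid_at(self, when: date) -> bool:
        # Assumed convention: half-open interval [valid_from, valid_to)
        return self.valid_from <= when and (
            self.valid_to is None or when < self.valid_to
        )


@dataclass(frozen=True)
class WebTuple:
    """A set of documents plus the hyperlinks that connect them."""
    documents: Tuple[WebDocument, ...]
    links: Tuple[Tuple[str, str], ...]  # (source URL, target URL)


def time_slice(table: List[WebTuple], when: date) -> List[WebTuple]:
    """Hypothetical operator: keep tuples whose documents are all valid at `when`."""
    return [t for t in table if all(d.valid_at(when) for d in t.documents)]


# Made-up example data: a home page linking to two versions of a news page.
home = WebDocument("http://example.org/", "home", date(1999, 12, 1))
news_v1 = WebDocument("http://example.org/news", "version 1",
                      date(2000, 1, 1), date(2000, 2, 1))
news_v2 = WebDocument("http://example.org/news", "version 2", date(2000, 2, 1))
link = (("http://example.org/", "http://example.org/news"),)

table = [WebTuple((home, news_v1), link), WebTuple((home, news_v2), link)]
january = time_slice(table, date(2000, 1, 15))
february = time_slice(table, date(2000, 2, 1))
```

Under these assumptions, `january` contains only the tuple with the first version of the news page and `february` only the tuple with the second, because the half-open interval ends the first version on the day the second begins.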


Published In

ER'00: Proceedings of the 19th international conference on Conceptual modeling
October 2000
587 pages
ISBN: 3540410724
Editors: Alberto H. F. Laender, Stephen W. Liddle, Veda C. Storey

Publisher

Springer-Verlag, Berlin, Heidelberg
