Abstract
Motivated by decision support problems, data mining has been extensively addressed in the past few years. Nevertheless, the proposed approaches mainly concern flat representations of the data and, to the best of our knowledge, not much effort has been spent on mining interesting patterns from more complex structures. In this paper we address the problem of mining structural associations in semistructured data, in other words the discovery of structural regularities in a large database of semistructured objects. This problem is much more complicated than the classical association rule problem, since complex structures in the form of labeled, hierarchical, partially ordered objects have to be taken into account.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Laur, P., Masseglia, F., Poncelet, P. (2000). Schema Mining: Finding Structural Regularity among Semistructured Data. In: Zighed, D.A., Komorowski, J., Żytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2000. Lecture Notes in Computer Science, vol 1910. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45372-5_57
DOI: https://doi.org/10.1007/3-540-45372-5_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41066-9
Online ISBN: 978-3-540-45372-7
eBook Packages: Springer Book Archive