Schema Mining: Finding Structural Regularity among Semistructured Data

P.A. Laur,
F. Masseglia &
P. Poncelet

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1910)

Included in the following conference series:

European Conference on Principles of Data Mining and Knowledge Discovery

Abstract

Motivated by decision support problems, data mining has been extensively addressed in the past few years. Nevertheless, the proposed approaches mainly concern flat representations of the data and, to the best of our knowledge, not much effort has been spent on mining interesting patterns from non-flat structures. In this paper we address the problem of mining structural associations in semistructured data, in other words the discovery of structural regularities in a large database of semistructured objects. This problem is much more complicated than the classical association rule problem, since complex structures in the form of labeled, partially ordered hierarchical objects have to be taken into account.

Author information

Authors and Affiliations

LIRMM UMR CNRS 5506, 161, Rue Ada, 34392, Montpellier Cedex 5, France
P.A. Laur, F. Masseglia & P. Poncelet
Laboratoire PRiSM, Université de Versailles, 45 Avenue des Etats-Unis, 78035, Versailles Cedex, France
F. Masseglia

Editor information

Editors and Affiliations

Department of Computer and Information Science, Norwegian University of Science and Technology, O.S. Bragstads plass 2E, 7491, Trondheim, Norway
Jan Komorowski
Department of Computer Science, University of North Carolina, Charlotte, NC 28223, USA
Jan Żytkow
Laboratoire ERIC, Université Lyon 2, 5 avenue Pierre Mendès-France, 69676, Bron, France
Djamel A. Zighed

About this paper

Cite this paper

Laur, P., Masseglia, F., Poncelet, P. (2000). Schema Mining: Finding Structural Regularity among Semistructured Data. In: Zighed, D.A., Komorowski, J., Żytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2000. Lecture Notes in Computer Science, vol 1910. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45372-5_57

DOI: https://doi.org/10.1007/3-540-45372-5_57
Published: 18 July 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41066-9
Online ISBN: 978-3-540-45372-7
eBook Packages: Springer Book Archive
