HDB-Subdue: A Scalable Approach to Graph Mining

Srihari Padmanabhan &
Sharma Chakravarthy

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5691)

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

Transactional data mining (association rules, decision trees, etc.) has been used effectively to find non-trivial patterns in categorical and unstructured data. For applications with inherent structure (e.g., social networks, proteins), graph mining is useful because mapping structured data into a transactional representation loses information. Graph mining identifies interesting or frequent subgraphs. Database mining uses SQL and a relational representation to overcome the limitations of main-memory algorithms and to achieve scalability.
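The information loss mentioned above can be illustrated with a minimal sketch. The two toy graphs and the helper names `as_transaction` and `as_edge_set` are assumptions made for this illustration and do not come from the paper. Flattening each graph into a bag of vertex labels makes the two graphs indistinguishable, while keeping the edges preserves the difference.

```python
from collections import Counter

# Hypothetical toy data: vertex id -> label, shared by both graphs.
labels = {1: "A", 2: "B", 3: "C"}

# Each graph is a (vertex labels, directed edges) pair.
chain = (labels, [(1, 2), (2, 3)])  # A -> B -> C
star = (labels, [(1, 2), (1, 3)])   # A -> B and A -> C


def as_transaction(graph):
    """Flatten a graph into a bag of vertex labels, discarding every edge."""
    vertex_labels, _edges = graph
    return Counter(vertex_labels.values())


def as_edge_set(graph):
    """Keep the structure: one (source label, target label) pair per edge."""
    vertex_labels, edges = graph
    return {(vertex_labels[u], vertex_labels[v]) for u, v in edges}


# The transactional view cannot tell the two graphs apart ...
print(as_transaction(chain) == as_transaction(star))
# ... whereas the structural view can.
print(as_edge_set(chain) == as_edge_set(star))
```

A graph miner works on the second kind of representation, which is why it can discover patterns that depend on how the items are connected.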

This paper presents a scalable, SQL-based approach to graph mining, specifically interesting substructure discovery. Our approach handles the most general form of graphs, including directed edges, multiple edges between nodes, and cycles. Our primary goal in this work has been to address scalability and to map difficult and computationally expensive problems, such as pseudo-duplicate elimination, canonical labeling, and isomorphism checking, into SQL-based counterparts. The notion of minimum description length (MDL) has been cast into a corresponding metric for the relational representation. Our experimental analysis shows that the algorithm and approach presented in this paper can handle graphs with millions of nodes and edges.
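As a rough illustration of what a relational formulation of substructure discovery can look like, the sketch below stores a small graph in two tables and uses joins to count one-edge substructures and to expand them by one more edge. The table names (`vertices`, `edges`), column names, and sample data are assumptions chosen for this example; the abstract does not give the actual HDB-Subdue schema, and the sketch does not attempt pseudo-duplicate elimination, canonical labeling, or the MDL-based metric.

```python
import sqlite3

# In-memory database with a hypothetical two-table graph representation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vertices (vno INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE edges (src INTEGER, dst INTEGER, label TEXT);
""")
conn.executemany(
    "INSERT INTO vertices VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "A"), (4, "B"), (5, "C")],
)
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?)",
    [(1, 2, "ab"), (3, 4, "ab"), (2, 5, "bc")],
)

# Count instances of each one-edge substructure, identified by the labels
# of its source vertex, edge, and destination vertex.
one_edge = conn.execute("""
    SELECT v1.label, e.label, v2.label, COUNT(*) AS instances
    FROM edges e
    JOIN vertices v1 ON v1.vno = e.src
    JOIN vertices v2 ON v2.vno = e.dst
    GROUP BY v1.label, e.label, v2.label
    ORDER BY instances DESC, v1.label
""").fetchall()

# Expand each one-edge instance by an edge that leaves its destination
# vertex. A general expansion would also consider edges touching the
# source vertex and incoming edges; this covers only one case.
two_edge = conn.execute("""
    SELECT e1.src, e1.dst, e2.dst
    FROM edges e1
    JOIN edges e2 ON e2.src = e1.dst
    ORDER BY e1.src
""").fetchall()

print(one_edge)
print(two_edge)
```

Because both steps are ordinary joins and aggregations, the database engine decides how to execute them and can use disk-based strategies when the tables do not fit in main memory.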

This work was supported, in part, by NSF grant IIS 0534611.

Author information

Authors and Affiliations

IT Laboratory & Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington, TX, 76019
Srihari Padmanabhan & Sharma Chakravarthy

Editor information

Editors and Affiliations

Department of Computer Science, Aalborg University, Selma Lagerlöfsvej 300, 9220, Aalborg Ø, Denmark
Torben Bach Pedersen
IBM India Research Lab, Plot No. 4, Block C, Institutional Area, Vasant Kunj, 110 070, New Delhi, India
Mukesh K. Mohania
Institute of Software Technology and Interactive Systems, Vienna University of Technology, Favoritenstr. 9-11/188, 1040, Wien, Austria
A Min Tjoa

About this paper

Cite this paper

Padmanabhan, S., Chakravarthy, S. (2009). HDB-Subdue: A Scalable Approach to Graph Mining. In: Pedersen, T.B., Mohania, M.K., Tjoa, A.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2009. Lecture Notes in Computer Science, vol 5691. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03730-6_26

DOI: https://doi.org/10.1007/978-3-642-03730-6_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03729-0
Online ISBN: 978-3-642-03730-6
eBook Packages: Computer Science, Computer Science (R0)
