Nothing Special   »   [go: up one dir, main page]

Skip to main content

Finding the Most Descriptive Substructures in Graphs with Discrete and Numeric Labels

  • Conference paper
New Frontiers in Mining Complex Patterns (NFMCP 2012)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 7765))

Included in the following conference series:

  • 611 Accesses

Abstract

Many graph datasets are labelled with discrete and numeric attributes. Frequent substructure discovery algorithms usually ignore numeric attributes; in this paper we show that they can be used to improve discrimination and search performance. Our thesis is that the most descriptive substructures are those which are normative both in terms of their structure and in terms of their numeric values. We explore the relationship between graph structure and the distribution of attribute values and propose an outlier-detection step, which is used as a constraint during substructure discovery. By pruning anomalous vertices and edges, more weight is given to the most descriptive substructures. Our experiments on a real-world access control database returns similar substructures to unconstrained search with 30% fewer graph isomorphism tests.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 54.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 72.00
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Bishop, C.M.: Pattern Recognition and Machine Learning. Springer (2006)

    Google Scholar 

  2. Borgelt, C.: Canonical forms for frequent graph mining. In: 30th Annual Conf. German Classification Society, GfKl 2006, pp. 337–349. Springer (2006)

    Google Scholar 

  3. Breunig, M.M., Kriegel, H.P., Ng, R.T., Sander, J.: LOF: Identifying density-based local outliers. SIGMOD Rec. 29(2), 93–104 (2000)

    Article  Google Scholar 

  4. Chakrabarti, D., Zhan, Y., Faloutsos, C.: R-MAT: A recursive model for graph mining. In: SDM 2004. SIAM (2004)

    Google Scholar 

  5. Cook, D.J., Holder, L.B.: Graph-based data mining. IEEE Intelligent Systems 15, 32–41 (2000)

    Google Scholar 

  6. Davis, M., Liu, W., Miller, P., Redpath, G.: Detecting anomalies in graphs with numeric labels. In: CIKM 2011, pp. 1197–1202. ACM (2011)

    Google Scholar 

  7. Eichinger, F., Huber, M., Böhm, K.: On the usefulness of weight-based constraints in frequent subgraph mining. In: ICAI 2010, pp. 65–78. BCS SGAI (December 2010)

    Google Scholar 

  8. Fortin, S.: The graph isomorphism problem. Tech. rep., Univ. of Alberta (1996)

    Google Scholar 

  9. Huan, J., Wang, W., Prins, J., Yang, J.: SPIN: Mining maximal frequent subgraphs from graph databases. In: KDD 2004, pp. 581–586. ACM (2004)

    Google Scholar 

  10. Inokuchi, A., Washio, T., Motoda, H.: An apriori-based algorithm for mining frequent substructures from graph data. In: Zighed, D.A., Komorowski, J., Żytkow, J.M. (eds.) PKDD 2000. LNCS (LNAI), vol. 1910, pp. 13–23. Springer, Heidelberg (2000)

    Chapter  Google Scholar 

  11. Jiang, C., Coenen, F., Zito, M.: Frequent sub-graph mining on edge weighted graphs. In: Bach Pedersen, T., Mohania, M.K., Tjoa, A.M. (eds.) DAWAK 2010. LNCS, vol. 6263, pp. 77–88. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

  12. Kriegel, H.P., Kröger, P., Zimek, A.: Clustering high-dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering. ACM Transactions on Knowledge Discovery in Data 3(1), 1:1–1:58 (2009)

    Google Scholar 

  13. Kuramochi, M., Karypis, G.: Frequent subgraph discovery. In: ICDM 2001, pp. 313–320. IEEE (2001)

    Google Scholar 

  14. McKay, B.D., Piperno, A.: Practical graph isomorphism, II (January 2013), http://arxiv.org/abs/1301.1493v1

  15. Moser, F., Colak, R., Rafiey, A., Ester, M.: Mining cohesive patterns from graphs with feature vectors. In: SDM, pp. 593–604 (2009)

    Google Scholar 

  16. Papadimitriou, S., Kitagawa, H., Gibbons, P., Faloutsos, C.: Loci: fast outlier detection using the local correlation integral. In: Proceedings of the 19th International Conference on Data Engineering 2003, pp. 315–326 (March 2003)

    Google Scholar 

  17. de Vries, T., Chawla, S., Houle, M.: Finding local anomalies in very high dimensional space. In: ICDM 2010, pp. 128–137. IEEE (2010)

    Google Scholar 

  18. Yan, X., Han, J.: gSpan: Graph-based substructure pattern mining. In: ICDM 2002, pp. 721–724. IEEE (2002)

    Google Scholar 

  19. Yan, X., Han, J.: CloseGraph: Mining closed frequent graph patterns. In: KDD 2003, pp. 286–295. ACM (2003)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Davis, M., Liu, W., Miller, P. (2013). Finding the Most Descriptive Substructures in Graphs with Discrete and Numeric Labels. In: Appice, A., Ceci, M., Loglisci, C., Manco, G., Masciari, E., Ras, Z.W. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2012. Lecture Notes in Computer Science(), vol 7765. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37382-4_10

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-37382-4_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37381-7

  • Online ISBN: 978-3-642-37382-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics