Extracting frequent connected subgraphs from large graph sets

Wei Wang¹,
Qing-Qing Yuan¹,
Hao-Feng Zhou¹,
Ming-Sheng Hong¹ &
Bai-Le Shi¹

Abstract

Mining frequent patterns from datasets is one of the key successes of data mining research. Most current studies focus on datasets whose elements are independent, such as the items in a market basket. However, objects in the real world are often closely related to each other. The objective of this paper is to extract frequent patterns from these relations. The authors model the relations as graphs and select a simple type of graph for analysis. By combining graph theory with algorithms for generating frequent patterns, they propose a new algorithm, called Topology, that mines these graphs efficiently. The performance of the algorithm is evaluated through experiments on synthetic datasets and real data. The experimental results show that Topology performs well. Finally, potential improvements are mentioned.
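
To make the notion of a frequent pattern in a graph set concrete, the following minimal sketch counts the support of single-edge patterns, the smallest connected subgraphs, across a small set of labeled graphs. It is not the Topology algorithm, whose details are not given in the abstract. The graph encoding, the function names `edge_patterns` and `frequent_edges`, and the `min_support` parameter are all assumptions made for illustration.

```python
from collections import Counter

# Assumed encoding (hypothetical, not from the paper): a graph is a pair
# (labels, edges), where labels maps vertex id -> label and edges is a set
# of undirected edges (u, v).

def edge_patterns(graph):
    """Return the distinct labeled single-edge patterns in one graph.

    A pattern is a sorted pair of vertex labels, so ("C", "O") and
    ("O", "C") are treated as the same undirected pattern.
    """
    labels, edges = graph
    return {tuple(sorted((labels[u], labels[v]))) for u, v in edges}

def frequent_edges(graphs, min_support):
    """Return single-edge patterns occurring in at least min_support graphs."""
    support = Counter()
    for g in graphs:
        # A pattern counts once per graph, however often it occurs inside it.
        support.update(edge_patterns(g))
    return {p: s for p, s in support.items() if s >= min_support}

# Three toy graphs with atom-like vertex labels.
graphs = [
    ({0: "C", 1: "C", 2: "O"}, {(0, 1), (1, 2)}),
    ({0: "C", 1: "O", 2: "N"}, {(0, 1), (0, 2)}),
    ({0: "C", 1: "C", 2: "C"}, {(0, 1), (1, 2)}),
]
result = frequent_edges(graphs, min_support=2)
```

Here the patterns C-C and C-O each appear in two of the three graphs and are reported as frequent, while C-N appears in only one and is discarded. A full miner would go on to grow larger connected subgraphs from such frequent edges and would need a graph isomorphism or canonical labeling step to recognize equivalent patterns, which this sketch sidesteps by restricting itself to single edges.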

Author information

Authors and Affiliations

Department of Computing and Information Technology, Fudan University, 200433, Shanghai, P.R. China
Wei Wang, Qing-Qing Yuan, Hao-Feng Zhou, Ming-Sheng Hong & Bai-Le Shi

Corresponding author

Correspondence to Wei Wang.

Additional information

This work was supported by the National Natural Science Foundation of China (Grant Nos. 69933030 and 60303008) and the National High-Technology Development 863 Program of China (Grant No. 2002AA4Z3430).

Wei Wang received the B.S. degree in computer science from Shandong University in 1992 and the Ph.D. degree in computer science from Fudan University in 1998. He is now a professor in the Department of Computing and Information Technology, Fudan University. His research interests include databases, data warehouses, and data mining.

Qing-Qing Yuan received the B.S. and M.S. degrees in computer science from Fudan University in 2000 and 2003, respectively. She is now a Ph.D. candidate in the Department of Computer Science, University of California, Santa Barbara. Her research interests include databases and data mining.

Hao-Feng Zhou received the B.S. degree in computer science from Shanghai University in 1997, and the M.S. and Ph.D. degrees in computer science from Fudan University in 2000 and 2003, respectively. His research interests include databases and data mining.

Ming-Sheng Hong received the B.S. degree in computer science from Fudan University in 2002. He is now a Ph.D. candidate in the Department of Computer Science, Cornell University. His research interests include databases and data mining.

Bai-Le Shi received the B.S. degree in mathematics from Peking University in 1957. He is a professor in the Department of Computing and Information Technology, Fudan University. He is also director of the Shanghai (International) Database Research Center. His research interests include databases, data warehouses, and digital libraries.

About this article

Cite this article

Wang, W., Yuan, QQ., Zhou, HF. et al. Extracting frequent connected subgraphs from large graph sets. J. Comput. Sci. & Technol. 19, 867–875 (2004). https://doi.org/10.1007/BF02973450

Received: 23 May 2003
Revised: 28 April 2004
Published: 11 October 2008
Issue Date: December 2004
DOI: https://doi.org/10.1007/BF02973450
