research-article

On analyzing graphs with motif-paths

Authors:

Kevin Chen-Chuan Chang,

Hongtai CaoAuthors Info & Claims

Proceedings of the VLDB Endowment, Volume 14, Issue 6

Pages 1111 - 1123

https://doi.org/10.14778/3447689.3447714

Published: 01 February 2021 Publication History

Abstract

Path-based solutions have been shown to be useful for various graph analysis tasks, such as link prediction and graph clustering. However, they are no longer adequate for handling complex and gigantic graphs. Recently, motif-based analysis has attracted a lot of attention. A motif, or a small graph with a few nodes, is often considered as a fundamental unit of a graph. Motif-based analysis captures high-order structure between nodes, and performs better than traditional "edge-based" solutions. In this paper, we study motif-path, which is conceptually a concatenation of one or more motif instances. We examine how motif-paths can be used in three path-based mining tasks, namely link prediction, local graph clustering and node ranking. We further address the situation when two graph nodes are not connected through a motif-path, and develop a novel defragmentation method to enhance it. Experimental results on real graph datasets demonstrate the use of motif-paths and defragmentation techniques improves graph analysis effectiveness.

References

[1]

Ghadeer AbuOda, Gianmarco De Francisci Morales, and Ashraf Aboulnaga. 2019. Link prediction via higher-order motif features. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 412--429.

[2]

Nesreen K Ahmed, Jennifer Neville, Ryan A Rossi, and Nick Duffield. 2015. Efficient graphlet counting for large networks. In 2015 IEEE International Conference on Data Mining. IEEE, 1--10.

Digital Library

[3]

Albert-László Barabási and Réka Albert. 1999. Emergence of scaling in random networks. science 286, 5439 (1999), 509--512.

[4]

Austin R Benson, David F Gleich, and Jure Leskovec. 2015. Tensor spectral clustering for partitioning higher-order network structures. In Proceedings of the 2015 SIAM International Conference on Data Mining. SIAM, 118--126.

[5]

Austin R Benson, David F Gleich, and Jure Leskovec. 2016. Higher-order organization of complex networks. Science 353, 6295 (2016), 163--166.

[6]

Ulrik Brandes. 2001. A faster algorithm for betweenness centrality. Journal of mathematical sociology 25, 2 (2001), 163--177.

[7]

Jie Cao, Bentian Li, and Xiangquan Gui. 2016. Research on the Influence of Network Motif on Link Prediction. DEStech Transactions on Computer Science and Engineering itms (2016).

[8]

Xiaowei Chen, Yongkun Li, Pinghui Wang, and John Lui. 2016. A general framework for estimating graphlet statistics via random walk. Proceedings of the VLDB Endowment (2016).

Digital Library

[9]

Wanyun Cui, Yanghua Xiao, Haixun Wang, Yiqi Lu, and Wei Wang. 2013. Online search of overlapping communities. In Proceedings of the 2013 ACM SIGMOD international conference on Management of data. 277--288.

Digital Library

[10]

Vachik S Dave, Nesreen K Ahmed, and Mohammad Al Hasan. 2017. E-CLoG: counting edge-centric local graphlets. In 2017 IEEE International Conference on Big Data (Big Data). IEEE, 586--595.

[11]

Yixiang Fang, Reynold Cheng, Xiaodong Li, Siqiang Luo, and Jiafeng Hu. 2017. Effective community search over large spatial graphs. Proceedings of the VLDB Endowment 10, 6 (2017), 709--720.

Digital Library

[12]

Yixiang Fang, Zheng Wang, Reynold Cheng, Xiaodong Li, Siqiang Luo, Jiafeng Hu, and Xiaojun Chen. 2018. On spatial-aware community search. IEEE Transactions on Knowledge and Data Engineering 31, 4 (2018), 783--798.

Digital Library

[13]

Anne-Claude Gavin, Patrick Aloy, Paola Grandi, Roland Krause, Markus Boesche, Martina Marzioch, Christina Rau, Lars Juhl Jensen, Sonja Bastuck, Birgit Dümpelfeld, et al. 2006. Proteome survey reveals modularity of the yeast cell machinery. Nature 440, 7084 (2006), 631.

[14]

Xiaolin Han, Tobias Grubenmann, Reynold Cheng, Sze Chun Wong, Xiaodong Li, and Wenya Sun. 2020. Traffic Incident Detection: A Trajectory-based Approach. In 2020 IEEE 36th International Conference on Data Engineering (ICDE). IEEE, 1866--1869.

[15]

Tomaž Hočevar and Janez Demšar. 2014. A combinatorial approach to graphlet counting. Bioinformatics 30, 4 (2014), 559--565.

[16]

Xin Huang, Hong Cheng, Lu Qin, Wentao Tian, and Jeffrey Xu Yu. 2014. Querying k-truss community in large and dynamic graphs. In Proceedings of the 2014 ACM SIGMOD. ACM, 1311--1322.

Digital Library

[17]

Leo Katz. 1953. A new status index derived from sociometric analysis. Psychometrika 18, 1 (1953), 39--43.

[18]

Arun S Konagurthu and Arthur M Lesk. 2008. On the origin of distribution patterns of motifs in biological networks. BMC Systems Biology 2, 1 (2008), 73.

[19]

Nevan J Krogan, Gerard Cagney, Haiyuan Yu, Gouqing Zhong, Xinghua Guo, Alexandr Ignatchenko, Joyce Li, Shuye Pu, Nira Datta, Aaron P Tikuisis, et al. 2006. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature (2006).

[20]

Linyuan LÃij and Tao Zhou. 2011. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications 390, 6 (2011), 1150 -- 1170.

[21]

Pei-Zhen Li, Ling Huang, Chang-Dong Wang, and Jian-Huang Lai. 2019. EdMot: An Edge Enhancement Approach for Motif-aware Community Detection. In Proceedings of the 25th ACM SIGKDD. 479--487.

Digital Library

[22]

Xiaodong Li. 2019. DURS: A Distributed Method for k-Nearest Neighbor Search on Uncertain Graphs. In 2019 20th IEEE International Conference on Mobile Data Management (MDM). IEEE, 377--378.

[23]

Xiaodong Li, Reynold Cheng, Kevin Chang, Caihua Shan, Chenhao Ma, and Hongtai Cao. 2021. Code: On Analyzing Graphs with Motif-Paths. (2021). https://github.com/li20mp/motif-path.

Digital Library

[24]

Xiaodong Li, Reynold Cheng, Kevin Chang, Caihua Shan, Chenhao Ma, and Hongtai Cao. 2021. Supplemental Material: On Analyzing Graphs with Motif-Paths. (2021). https://i.cs.hku.hk/~xdli/mpath-sup.pdf.

Digital Library

[25]

Xiaodong Li, Reynold Cheng, Yixiang Fang, Jiafeng Hu, and Silviu Maniu. 2018. Scalable evaluation of k-nn queries on large uncertain graphs. In 21st International Conference on Extending Database Technology (EDBT). 181--192.

[26]

Xiaodong Li, Reynold Cheng, Matin Najafi, Kevin Chang, Xiaolin Han, and Hongtai Cao. 2020. M-Cypher: A GQL Framework Supporting Motifs. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (CIKM). 3433--3436.

Digital Library

[27]

Guanfeng Liu, Yan Wang, and Mehmet A. Orgun. 2010. Optimal Social Trust Path Selection in Complex Social Networks. In Proceedings of the Twenty-Fourth AAAI 2010.

Digital Library

[28]

Guanfeng Liu, Yan Wang, Mehmet A. Orgun, and Ee-Peng Lim. 2013. Finding the Optimal Social Trust Path for the Selection of Trustworthy Service Providers in Complex Social Networks. IEEE Trans. Services Computing (2013).

Digital Library

[29]

Linyuan Lü, Ci-Hang Jin, and Tao Zhou. 2009. Similarity index based on local paths for link prediction of complex networks. Physical Review E 80, 4 (2009), 046122.

[30]

Chenhao Ma, Reynold Cheng, Laks VS Lakshmanan, Tobias Grubenmann, Yixiang Fang, and Xiaodong Li. 2019. LINC: a motif counting algorithm for uncertain graphs. Proceedings of the VLDB Endowment 13, 2 (2019), 155--168.

Digital Library

[31]

Markus Maier, Ulrike V Luxburg, and Matthias Hein. 2009. Influence of graph construction on graph-based clustering measures. In Advances in neural information processing systems. 1025--1032.

Digital Library

[32]

Ine Melckenbeeck, Pieter Audenaert, Didier Colle, and Mario Pickavet. 2018. Efficiently counting all orbits of graphlets of any order in a graph using autogenerated equations. Bioinformatics 34, 8 (2018), 1372--1380.

[33]

Giovanni Micale, Rosalba Giugno, Alfredo Ferro, Misael Mongioví, Dennis Shasha, and Alfredo Pulvirenti. 2018. Fast analytical methods for finding significant labeled graph motifs. Data Mining and Knowledge Discovery 32, 2 (2018), 504--531.

Digital Library

[34]

Ron Milo, Shai Shen-Orr, Shalev Itzkovitz, Nadav Kashtan, Dmitri Chklovskii, and Uri Alon. 2002. Network motifs: simple building blocks of complex networks. Science 298, 5594 (2002), 824--827.

[35]

Paul Monagle, Anthony KC Chan, Neil A Goldenberg, Rebecca N Ichord, Janna M Journeycake, Ulrike Nowak-Göttl, and Sara K Vesely. 2012. Antithrombotic therapy in neonates and children: antithrombotic therapy and prevention of thrombosis: American College of Chest Physicians Evidence-Based Clinical Practice Guidelines. Chest (2012).

[36]

Kirill Paramonov and James Sharpnack. 2019. Estimating Graphlet Statistics via Lifting. SIGKDD (2019).

Digital Library

[37]

Ali Pinar, C Seshadhri, and Vaidyanathan Vishal. 2017. Escape: Efficiently counting all 5-vertex subgraphs. In Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 1431--1440.

Digital Library

[38]

Kenneth H Rosen and Kamala Krithivasan. 2012. Discrete mathematics and its applications: with combinatorics and graph theory. Tata McGraw-Hill Education.

[39]

Ngoc Hieu Tran, Kwok Pui Choi, and Louxin Zhang. 2013. Counting motifs in the human interactome. Nature communications 4 (2013), 2241.

[40]

Charalampos E Tsourakakis, Jakub Pachocki, and Michael Mitzenmacher. 2017. Scalable motif-aware graph clustering. In Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 1451--1460.

Digital Library

[41]

Pinghui Wang, Junzhou Zhao, Xiangliang Zhang, Jing Tao, and Xiaohong Guan. 2018. SNOD: a fast sampling method of exploring node orbit degrees for large graphs. Knowledge and Information Systems (2018), 1--26.

[42]

Jaewon Yang and Jure Leskovec. 2015. Defining and evaluating network communities based on ground-truth. Knowledge and Information Systems 42, 1 (2015), 181--213.

Digital Library

[43]

Yabing Yao, Ruisheng Zhang, Fan Yang, Jianxin Tang, Yongna Yuan, and Rongjing Hu. 2018. Link prediction in complex networks based on the interactions among paths. Physica A: Statistical Mechanics and its Applications 510 (2018), 52--67.

[44]

Hao Yin, Austin R Benson, Jure Leskovec, and David F Gleich. 2017. Local higher-order graph clustering. In Proceedings of the 23rd ACM SIGKDD. ACM, 555--564.

Digital Library

[45]

Huan Zhao, Xiaogang Xu, Yangqiu Song, Dik Lun Lee, Zhao Chen, and Han Gao. 2018. Ranking users in social networks with higher-order structures. In Thirty-Second AAAI Conference on Artificial Intelligence.

Cited By

Zhou YFang YMa CHou THuang X(2024)Efficient Maximal Motif-Clique Enumeration over Large Heterogeneous Information NetworksProceedings of the VLDB Endowment10.14778/3681954.368197517:11(2946-2959)Online publication date: 30-Aug-2024
https://doi.org/10.14778/3681954.3681975
Huo NCheng RKao BNing WHaldar NLi XLi JNajafi MLi TQu G(2024)ZeroEA: A Zero-Training Entity Alignment Framework via Pre-Trained Language ModelProceedings of the VLDB Endowment10.14778/3654621.365464017:7(1765-1774)Online publication date: 1-Mar-2024
https://dl.acm.org/doi/10.14778/3654621.3654640
Ma CFang YCheng RLakshmanan LHan XLi X(2024)Accelerating directed densest subgraph queries with software and hardware approachesThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-023-00805-033:1(207-230)Online publication date: 1-Jan-2024
https://dl.acm.org/doi/10.1007/s00778-023-00805-0
Show More Cited By

On analyzing graphs with motif-paths
1. Information systems
  1. Information systems applications

Recommendations

Packing 3-vertex paths in claw-free graphs and related topics

A @L-factor of a graph G is a spanning subgraph of G whose every component is a 3-vertex path. Let v(G) be the number of vertices of G and @c(G) the domination number of G. A claw is a graph with four vertices and three edges incident to the same ...
Decomposition of complete graphs into paths and stars

Let P"k"+"1 denote a path of length k and let S"k"+"1 denote a star with k edges. As usual K"n denotes the complete graph on n vertices. In this paper we investigate the decomposition of K"n into paths and stars, and prove the following results. Theorem ...
Hamiltonian and long paths in bipartite graphs with connectivity
Abstract
Let G be a graph, ν ( G ) the order of G, κ ( G ) the connectivity of G and k a positive integer such that k ≤ ( ν ( G ) − 2 ) / 2. Then G is said to be k-extendable if it has a matching of size k and every matching of size k extends ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the VLDB Endowment

Proceedings of the VLDB Endowment Volume 14, Issue 6

February 2021

261 pages

ISSN:2150-8097

Editors:
Xin Luna Dong
Amazon
,
Felix Naumann
HPI, University of Potsdam

Issue’s Table of Contents

Publisher

VLDB Endowment

Publication History

Published: 01 February 2021

Published in PVLDB Volume 14, Issue 6

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

10
Total Citations
View Citations
296
Total Downloads

Downloads (Last 12 months)36
Downloads (Last 6 weeks)1

Reflects downloads up to 19 Sep 2024

Other Metrics

View Author Metrics

Citations

Cited By

Zhou YFang YMa CHou THuang X(2024)Efficient Maximal Motif-Clique Enumeration over Large Heterogeneous Information NetworksProceedings of the VLDB Endowment10.14778/3681954.368197517:11(2946-2959)Online publication date: 30-Aug-2024
https://doi.org/10.14778/3681954.3681975
Huo NCheng RKao BNing WHaldar NLi XLi JNajafi MLi TQu G(2024)ZeroEA: A Zero-Training Entity Alignment Framework via Pre-Trained Language ModelProceedings of the VLDB Endowment10.14778/3654621.365464017:7(1765-1774)Online publication date: 1-Mar-2024
https://dl.acm.org/doi/10.14778/3654621.3654640
Ma CFang YCheng RLakshmanan LHan XLi X(2024)Accelerating directed densest subgraph queries with software and hardware approachesThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-023-00805-033:1(207-230)Online publication date: 1-Jan-2024
https://dl.acm.org/doi/10.1007/s00778-023-00805-0
Ma CCheng RLakshmanan LHan X(2022)Finding locally densest subgraphsProceedings of the VLDB Endowment10.14778/3551793.355182615:11(2719-2732)Online publication date: 1-Jul-2022
https://dl.acm.org/doi/10.14778/3551793.3551826
Han XCheng RMa CGrubenmann T(2022)DeepTEAProceedings of the VLDB Endowment10.14778/3523210.352322515:7(1493-1505)Online publication date: 22-Jun-2022
https://dl.acm.org/doi/10.14778/3523210.3523225
Sheshbolouki AÖzsu M(2022)sGrow: Explaining the Scale-Invariant Strength Assortativity of Streaming ButterfliesACM Transactions on the Web10.1145/357240817:3(1-46)Online publication date: 14-Dec-2022
https://dl.acm.org/doi/10.1145/3572408
Ma CFang YCheng RLakshmanan LHan XIves ZBonifati AEl Abbadi A(2022)A Convex-Programming Approach for Efficient Directed Densest Subgraph DiscoveryProceedings of the 2022 International Conference on Management of Data10.1145/3514221.3517837(845-859)Online publication date: 10-Jun-2022
https://dl.acm.org/doi/10.1145/3514221.3517837
Silva VSilva MRibeiro PSilva F(2022)Novel features for time series analysis: a complex networks approachData Mining and Knowledge Discovery10.1007/s10618-022-00826-336:3(1062-1101)Online publication date: 1-May-2022
https://dl.acm.org/doi/10.1007/s10618-022-00826-3
Shan CShen YZhang YLi XLi DRanzato MBeygelzimer ADauphin YLiang PVaughan J(2021)Reinforcement learning enhanced explainer for graph neural networksProceedings of the 35th International Conference on Neural Information Processing Systems10.5555/3540261.3541986(22523-22533)Online publication date: 6-Dec-2021
https://dl.acm.org/doi/10.5555/3540261.3541986
Ma CFang YCheng RLakshmanan LZhang WLin X(2021)On Directed Densest Subgraph DiscoveryACM Transactions on Database Systems10.1145/348394046:4(1-45)Online publication date: 15-Nov-2021
https://dl.acm.org/doi/10.1145/3483940

View Options

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Issue’s Table of Contents