Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

On analyzing graphs with motif-paths

Published: 01 February 2021 Publication History

Abstract

Path-based solutions have been shown to be useful for various graph analysis tasks, such as link prediction and graph clustering. However, they are no longer adequate for handling complex and gigantic graphs. Recently, motif-based analysis has attracted a lot of attention. A motif, or a small graph with a few nodes, is often considered as a fundamental unit of a graph. Motif-based analysis captures high-order structure between nodes, and performs better than traditional "edge-based" solutions. In this paper, we study motif-path, which is conceptually a concatenation of one or more motif instances. We examine how motif-paths can be used in three path-based mining tasks, namely link prediction, local graph clustering and node ranking. We further address the situation when two graph nodes are not connected through a motif-path, and develop a novel defragmentation method to enhance it. Experimental results on real graph datasets demonstrate the use of motif-paths and defragmentation techniques improves graph analysis effectiveness.

References

[1]
Ghadeer AbuOda, Gianmarco De Francisci Morales, and Ashraf Aboulnaga. 2019. Link prediction via higher-order motif features. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 412--429.
[2]
Nesreen K Ahmed, Jennifer Neville, Ryan A Rossi, and Nick Duffield. 2015. Efficient graphlet counting for large networks. In 2015 IEEE International Conference on Data Mining. IEEE, 1--10.
[3]
Albert-László Barabási and Réka Albert. 1999. Emergence of scaling in random networks. science 286, 5439 (1999), 509--512.
[4]
Austin R Benson, David F Gleich, and Jure Leskovec. 2015. Tensor spectral clustering for partitioning higher-order network structures. In Proceedings of the 2015 SIAM International Conference on Data Mining. SIAM, 118--126.
[5]
Austin R Benson, David F Gleich, and Jure Leskovec. 2016. Higher-order organization of complex networks. Science 353, 6295 (2016), 163--166.
[6]
Ulrik Brandes. 2001. A faster algorithm for betweenness centrality. Journal of mathematical sociology 25, 2 (2001), 163--177.
[7]
Jie Cao, Bentian Li, and Xiangquan Gui. 2016. Research on the Influence of Network Motif on Link Prediction. DEStech Transactions on Computer Science and Engineering itms (2016).
[8]
Xiaowei Chen, Yongkun Li, Pinghui Wang, and John Lui. 2016. A general framework for estimating graphlet statistics via random walk. Proceedings of the VLDB Endowment (2016).
[9]
Wanyun Cui, Yanghua Xiao, Haixun Wang, Yiqi Lu, and Wei Wang. 2013. Online search of overlapping communities. In Proceedings of the 2013 ACM SIGMOD international conference on Management of data. 277--288.
[10]
Vachik S Dave, Nesreen K Ahmed, and Mohammad Al Hasan. 2017. E-CLoG: counting edge-centric local graphlets. In 2017 IEEE International Conference on Big Data (Big Data). IEEE, 586--595.
[11]
Yixiang Fang, Reynold Cheng, Xiaodong Li, Siqiang Luo, and Jiafeng Hu. 2017. Effective community search over large spatial graphs. Proceedings of the VLDB Endowment 10, 6 (2017), 709--720.
[12]
Yixiang Fang, Zheng Wang, Reynold Cheng, Xiaodong Li, Siqiang Luo, Jiafeng Hu, and Xiaojun Chen. 2018. On spatial-aware community search. IEEE Transactions on Knowledge and Data Engineering 31, 4 (2018), 783--798.
[13]
Anne-Claude Gavin, Patrick Aloy, Paola Grandi, Roland Krause, Markus Boesche, Martina Marzioch, Christina Rau, Lars Juhl Jensen, Sonja Bastuck, Birgit Dümpelfeld, et al. 2006. Proteome survey reveals modularity of the yeast cell machinery. Nature 440, 7084 (2006), 631.
[14]
Xiaolin Han, Tobias Grubenmann, Reynold Cheng, Sze Chun Wong, Xiaodong Li, and Wenya Sun. 2020. Traffic Incident Detection: A Trajectory-based Approach. In 2020 IEEE 36th International Conference on Data Engineering (ICDE). IEEE, 1866--1869.
[15]
Tomaž Hočevar and Janez Demšar. 2014. A combinatorial approach to graphlet counting. Bioinformatics 30, 4 (2014), 559--565.
[16]
Xin Huang, Hong Cheng, Lu Qin, Wentao Tian, and Jeffrey Xu Yu. 2014. Querying k-truss community in large and dynamic graphs. In Proceedings of the 2014 ACM SIGMOD. ACM, 1311--1322.
[17]
Leo Katz. 1953. A new status index derived from sociometric analysis. Psychometrika 18, 1 (1953), 39--43.
[18]
Arun S Konagurthu and Arthur M Lesk. 2008. On the origin of distribution patterns of motifs in biological networks. BMC Systems Biology 2, 1 (2008), 73.
[19]
Nevan J Krogan, Gerard Cagney, Haiyuan Yu, Gouqing Zhong, Xinghua Guo, Alexandr Ignatchenko, Joyce Li, Shuye Pu, Nira Datta, Aaron P Tikuisis, et al. 2006. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature (2006).
[20]
Linyuan LÃij and Tao Zhou. 2011. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications 390, 6 (2011), 1150 -- 1170.
[21]
Pei-Zhen Li, Ling Huang, Chang-Dong Wang, and Jian-Huang Lai. 2019. EdMot: An Edge Enhancement Approach for Motif-aware Community Detection. In Proceedings of the 25th ACM SIGKDD. 479--487.
[22]
Xiaodong Li. 2019. DURS: A Distributed Method for k-Nearest Neighbor Search on Uncertain Graphs. In 2019 20th IEEE International Conference on Mobile Data Management (MDM). IEEE, 377--378.
[23]
Xiaodong Li, Reynold Cheng, Kevin Chang, Caihua Shan, Chenhao Ma, and Hongtai Cao. 2021. Code: On Analyzing Graphs with Motif-Paths. (2021). https://github.com/li20mp/motif-path.
[24]
Xiaodong Li, Reynold Cheng, Kevin Chang, Caihua Shan, Chenhao Ma, and Hongtai Cao. 2021. Supplemental Material: On Analyzing Graphs with Motif-Paths. (2021). https://i.cs.hku.hk/~xdli/mpath-sup.pdf.
[25]
Xiaodong Li, Reynold Cheng, Yixiang Fang, Jiafeng Hu, and Silviu Maniu. 2018. Scalable evaluation of k-nn queries on large uncertain graphs. In 21st International Conference on Extending Database Technology (EDBT). 181--192.
[26]
Xiaodong Li, Reynold Cheng, Matin Najafi, Kevin Chang, Xiaolin Han, and Hongtai Cao. 2020. M-Cypher: A GQL Framework Supporting Motifs. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (CIKM). 3433--3436.
[27]
Guanfeng Liu, Yan Wang, and Mehmet A. Orgun. 2010. Optimal Social Trust Path Selection in Complex Social Networks. In Proceedings of the Twenty-Fourth AAAI 2010.
[28]
Guanfeng Liu, Yan Wang, Mehmet A. Orgun, and Ee-Peng Lim. 2013. Finding the Optimal Social Trust Path for the Selection of Trustworthy Service Providers in Complex Social Networks. IEEE Trans. Services Computing (2013).
[29]
Linyuan Lü, Ci-Hang Jin, and Tao Zhou. 2009. Similarity index based on local paths for link prediction of complex networks. Physical Review E 80, 4 (2009), 046122.
[30]
Chenhao Ma, Reynold Cheng, Laks VS Lakshmanan, Tobias Grubenmann, Yixiang Fang, and Xiaodong Li. 2019. LINC: a motif counting algorithm for uncertain graphs. Proceedings of the VLDB Endowment 13, 2 (2019), 155--168.
[31]
Markus Maier, Ulrike V Luxburg, and Matthias Hein. 2009. Influence of graph construction on graph-based clustering measures. In Advances in neural information processing systems. 1025--1032.
[32]
Ine Melckenbeeck, Pieter Audenaert, Didier Colle, and Mario Pickavet. 2018. Efficiently counting all orbits of graphlets of any order in a graph using autogenerated equations. Bioinformatics 34, 8 (2018), 1372--1380.
[33]
Giovanni Micale, Rosalba Giugno, Alfredo Ferro, Misael Mongioví, Dennis Shasha, and Alfredo Pulvirenti. 2018. Fast analytical methods for finding significant labeled graph motifs. Data Mining and Knowledge Discovery 32, 2 (2018), 504--531.
[34]
Ron Milo, Shai Shen-Orr, Shalev Itzkovitz, Nadav Kashtan, Dmitri Chklovskii, and Uri Alon. 2002. Network motifs: simple building blocks of complex networks. Science 298, 5594 (2002), 824--827.
[35]
Paul Monagle, Anthony KC Chan, Neil A Goldenberg, Rebecca N Ichord, Janna M Journeycake, Ulrike Nowak-Göttl, and Sara K Vesely. 2012. Antithrombotic therapy in neonates and children: antithrombotic therapy and prevention of thrombosis: American College of Chest Physicians Evidence-Based Clinical Practice Guidelines. Chest (2012).
[36]
Kirill Paramonov and James Sharpnack. 2019. Estimating Graphlet Statistics via Lifting. SIGKDD (2019).
[37]
Ali Pinar, C Seshadhri, and Vaidyanathan Vishal. 2017. Escape: Efficiently counting all 5-vertex subgraphs. In Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 1431--1440.
[38]
Kenneth H Rosen and Kamala Krithivasan. 2012. Discrete mathematics and its applications: with combinatorics and graph theory. Tata McGraw-Hill Education.
[39]
Ngoc Hieu Tran, Kwok Pui Choi, and Louxin Zhang. 2013. Counting motifs in the human interactome. Nature communications 4 (2013), 2241.
[40]
Charalampos E Tsourakakis, Jakub Pachocki, and Michael Mitzenmacher. 2017. Scalable motif-aware graph clustering. In Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 1451--1460.
[41]
Pinghui Wang, Junzhou Zhao, Xiangliang Zhang, Jing Tao, and Xiaohong Guan. 2018. SNOD: a fast sampling method of exploring node orbit degrees for large graphs. Knowledge and Information Systems (2018), 1--26.
[42]
Jaewon Yang and Jure Leskovec. 2015. Defining and evaluating network communities based on ground-truth. Knowledge and Information Systems 42, 1 (2015), 181--213.
[43]
Yabing Yao, Ruisheng Zhang, Fan Yang, Jianxin Tang, Yongna Yuan, and Rongjing Hu. 2018. Link prediction in complex networks based on the interactions among paths. Physica A: Statistical Mechanics and its Applications 510 (2018), 52--67.
[44]
Hao Yin, Austin R Benson, Jure Leskovec, and David F Gleich. 2017. Local higher-order graph clustering. In Proceedings of the 23rd ACM SIGKDD. ACM, 555--564.
[45]
Huan Zhao, Xiaogang Xu, Yangqiu Song, Dik Lun Lee, Zhao Chen, and Han Gao. 2018. Ranking users in social networks with higher-order structures. In Thirty-Second AAAI Conference on Artificial Intelligence.

Cited By

View all
  • (2024)Efficient Maximal Motif-Clique Enumeration over Large Heterogeneous Information NetworksProceedings of the VLDB Endowment10.14778/3681954.368197517:11(2946-2959)Online publication date: 30-Aug-2024
  • (2024)ZeroEA: A Zero-Training Entity Alignment Framework via Pre-Trained Language ModelProceedings of the VLDB Endowment10.14778/3654621.365464017:7(1765-1774)Online publication date: 1-Mar-2024
  • (2024)Accelerating directed densest subgraph queries with software and hardware approachesThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-023-00805-033:1(207-230)Online publication date: 1-Jan-2024
  • Show More Cited By
  1. On analyzing graphs with motif-paths

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image Proceedings of the VLDB Endowment
    Proceedings of the VLDB Endowment  Volume 14, Issue 6
    February 2021
    261 pages
    ISSN:2150-8097
    Issue’s Table of Contents

    Publisher

    VLDB Endowment

    Publication History

    Published: 01 February 2021
    Published in PVLDB Volume 14, Issue 6

    Qualifiers

    • Research-article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)36
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 19 Sep 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Efficient Maximal Motif-Clique Enumeration over Large Heterogeneous Information NetworksProceedings of the VLDB Endowment10.14778/3681954.368197517:11(2946-2959)Online publication date: 30-Aug-2024
    • (2024)ZeroEA: A Zero-Training Entity Alignment Framework via Pre-Trained Language ModelProceedings of the VLDB Endowment10.14778/3654621.365464017:7(1765-1774)Online publication date: 1-Mar-2024
    • (2024)Accelerating directed densest subgraph queries with software and hardware approachesThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-023-00805-033:1(207-230)Online publication date: 1-Jan-2024
    • (2022)Finding locally densest subgraphsProceedings of the VLDB Endowment10.14778/3551793.355182615:11(2719-2732)Online publication date: 1-Jul-2022
    • (2022)DeepTEAProceedings of the VLDB Endowment10.14778/3523210.352322515:7(1493-1505)Online publication date: 22-Jun-2022
    • (2022)sGrow: Explaining the Scale-Invariant Strength Assortativity of Streaming ButterfliesACM Transactions on the Web10.1145/357240817:3(1-46)Online publication date: 14-Dec-2022
    • (2022)A Convex-Programming Approach for Efficient Directed Densest Subgraph DiscoveryProceedings of the 2022 International Conference on Management of Data10.1145/3514221.3517837(845-859)Online publication date: 10-Jun-2022
    • (2022)Novel features for time series analysis: a complex networks approachData Mining and Knowledge Discovery10.1007/s10618-022-00826-336:3(1062-1101)Online publication date: 1-May-2022
    • (2021)Reinforcement learning enhanced explainer for graph neural networksProceedings of the 35th International Conference on Neural Information Processing Systems10.5555/3540261.3541986(22523-22533)Online publication date: 6-Dec-2021
    • (2021)On Directed Densest Subgraph DiscoveryACM Transactions on Database Systems10.1145/348394046:4(1-45)Online publication date: 15-Nov-2021

    View Options

    Get Access

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media