Nothing Special   »   [go: up one dir, main page]

skip to main content
survey
Public Access

Graph Summarization Methods and Applications: A Survey

Published: 22 June 2018 Publication History

Abstract

While advances in computing resources have made processing enormous amounts of data possible, human ability to identify patterns in such data has not scaled accordingly. Efficient computational methods for condensing and simplifying data are thus becoming vital for extracting actionable insights. In particular, while data summarization techniques have been studied extensively, only recently has summarizing interconnected data, or graphs, become popular. This survey is a structured, comprehensive overview of the state-of-the-art methods for summarizing graph data. We first broach the motivation behind and the challenges of graph summarization. We then categorize summarization approaches by the type of graphs taken as input and further organize each category by core methodology. Finally, we discuss applications of summarization on real-world graphs and conclude by describing some open problems in the field.

References

[1]
Bijaya Adhikari, Yao Zhang, Aditya Bharadwaj, and B. Aditya Prakash. 2017. Condensing temporal networks using propagation. In Proceedings of the 2017 SIAM International Conference on Data Mining. 417--425
[2]
Charu Aggarwal and Karthik Subbian. 2014. Evolutionary network analysis: A survey. ACM Comput. Surv. 47, 1 (2014), 10:1--10:36.
[3]
Charu C. Aggarwal. 2015. Data Mining: The Textbook. Springer.
[4]
Charu C. Aggarwal and S. Yu Philip. 2005. Online analysis of community evolution in data streams. In Proceedings of the SIAM international Conference on Data Mining (SDM’05).
[5]
Charu C. Aggarwal and Haixun Wang. 2010. A survey of clustering algorithms for graph data. In Managing and Mining Graph Data. Springer, 275--301.
[6]
Amr Ahmed, Nino Shervashidze, Shravan Narayanamurthy, Vanja Josifovski, and Alexander J. Smola. 2013. Distributed large-scale natural graph factorization. In Proceedings of the 22nd International Conference on World Wide Web (WWW’13). 37--48.
[7]
Kook Jin Ahn, Sudipto Guha, and Andrew McGregor. 2012. Graph sketches: Sparsification, spanners, and subgraphs. In Proceedings of the 31st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. 5--14.
[8]
Sebastian E. Ahnert. 2013. Power graph compression reveals dominant relationships in genetic transcription networks. Molec. BioSyst. 9, 11 (2013), 2681--2685.
[9]
Alfred V. Aho, M. R. Garey, and Jeffrey D. Ullman. 1972. The transitive reduction of a directed graph. Siam J. Comput. 1, 2 (1972), 131--137.
[10]
Leman Akoglu, Duen Horng Chau, U. Kang, Danai Koutra, and Christos Faloutsos. 2012. OPAvion: Mining and visualization in large graphs. In Proceedings of the 2012 SIGMOD Conference. ACM, 717--720.
[11]
Leman Akoglu, Mary McGlohon, and Christos Faloutsos. 2010. OddBall: Spotting anomalies in weighted graphs. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’10).
[12]
Leman Akoglu, Hanghang Tong, Jilles Vreeken, and Christos Faloutsos. 2012. Fast and reliable anomaly detection in categorical data. In Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM’12). ACM.
[13]
Charles J. Alpert, Andrew B. Kahng, and So-Zen Yao. 1999. Spectral partitioning with multiple eigenvectors. Discr. Appl. Math. 90, 1 (1999), 3--26.
[14]
Alexandr Andoni and Piotr Indyk. 2008. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Commun. ACM 51, (2008), 117--122.
[15]
Alberto Apostolico and Guido Drovandi. 2009. Graph compression by BFS. Algorithms 2, 3 (2009), 1031--1044.
[16]
Miguel Araujo, Spiros Papadimitriou, Stephan Günnemann, Christos Faloutsos, Prithwish Basu, Ananthram Swami, Evangelos E. Papalexakis, and Danai Koutra. 2014. Com2: Fast automatic discovery of temporal (“comet”) communities. In Advances in Knowledge Discovery and Data Mining. Lecture Notes in Computer Science, Vol. 8444. Springer, 271--283.
[17]
Lars Backstrom, Dan Huttenlocher, Jon Kleinberg, and Xiangyang Lan. 2006. Group formation in large social networks: Membership, growth, and evolution. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’06). ACM, 44--54.
[18]
Mathieu Bastian, Sebastien Heymann, and Mathieu Jacomy. 2009. Gephi: An open source software for exploring and manipulating networks. In Proceedings of the International AAAI Conference on Weblogs and Social Media.
[19]
Joshua D. Batson, Daniel A. Spielman, Nikhil Srivastava, and Shang-Hua Teng. 2013. Spectral sparsification of graphs: Theory and algorithms. Commun. ACM 56, 8 (2013), 87--94.
[20]
Enrico Bertini and Giuseppe Santucci. 2004. By chance is not enough: Preserving relative density through non uniform sampling. In Proceedings of the Information Visualisation Conference.
[21]
Paolo Boldi and Sebastiano Vigna. 2004. The webgraph framework I: Compression techniques. In Proceedings of the International World Wide Web Conference. 595--602.
[22]
Michael Bostock, Vadim Ogievetsky, and Jeffrey Heer. 2011. D3 data-driven documents. IEEE Trans. Vis. Comput. Graph. 17, 12 (2011), 2301--2309.
[23]
Ivan Brugere, Brian Gallagher, and Tanya Y. Berger-Wolf. 2016. Network structure inference, a survey: Motivations, methods, and applications. ACM Comput. Surv. 51, 2, Article 24. http://arxiv.org/abs/1610.00782.
[24]
Gregory Buehrer and Kumar Chellapilla. 2008. A scalable pattern mining approach to web graph compression with communities. In Proceedings of the 2008 International Conference on Web Search and Data Mining. ACM, 95--106.
[25]
Gemma Casas-Garriga. 2005. Summarizing sequential data with closed partial orders. In Proceedings of the SIAM International Conference on Data Mining (SDM’05). 380--391.
[26]
Šelja Čebirić, François Goasdoué, and Ioana Manolescu. 2015. Query-oriented summarization of RDF graphs. Proc. VLDB Endow. 8, 12 (2015), 2012--2015.
[27]
Deepayan Chakrabarti, Spiros Papadimitriou, Dharmendra S. Modha, and Christos Faloutsos. 2004. Fully automatic cross-associations. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’04). 79--88.
[28]
Varun Chandola and Vipin Kumar. 2005. Summarization -- Compressing data into an informative representation. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’05). 98--105.
[29]
Duen Horng Chau, Aniket Kittur, Jason I. Hong, and Christos Faloutsos. 2011. Apolo: Making sense of large network data by combining rich user interaction and machine learning. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’11).
[30]
Chen Chen, Cindy X. Lin, Matt Fredrikson, Mihai Christodorescu, Xifeng Yan, and Jiawei Han. 2009. Mining graph patterns efficiently via randomized summaries. Proc. VLDB Endow. 2, 1 (2009), 742--753.
[31]
Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Michael Mitzenmacher, Alessandro Panconesi, and Prabhakar Raghavan. 2009. On compressing social networks. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’09). 219--228.
[32]
Yongwook Choi and Wojciech Szpankowski. 2012. Compression of graphical structures: Fundamental limits, algorithms, and experiments. IEEE Trans. Inf. Theory 58, 2 (2012), 620--638.
[33]
Rudi Cilibrasi and Paul Vitányi. 2005. Clustering by compression. IEEE Trans. Inf. Theory 51, 4 (2005), 1523--1545.
[34]
Diane J. Cook and Lawrence B. Holder. 1994. Substructure discovery using minimum description length and background knowledge. J. Artif. Intell. Res. 1 (1994), 231--255.
[35]
Graham Cormode and S. Muthukrishnan. 2005a. An improved data stream summary: The count-min sketch and its applications. J. Algor. 55, 1 (2005), 58--75.
[36]
Graham Cormode and S. Muthukrishnan. 2005b. Summarizing and mining skewed data streams. In Proceedings of the SIAM international Conference on Data Mining (SDM’05).
[37]
Corinna Cortes, Daryl Pregibon, and Chris Volinsky. 2001. Communities of interest. In Proceedings of the 4th International Conference on Advances in Intelligent Data Analysis. 105--114.
[38]
Uros Damnjanovic, Virginia Fernandez Arguedas, Ebroul Izquierdo, and José M. Martínez. 2008. Event detection and clustering for surveillance video summarization. In Proceedingsof the 9th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS’08). IEEE, 63--66.
[39]
Pedro O. S. Vaz de Melo, Leman Akoglu, Christos Faloutsos, and Antonio Alfredo Ferreira Loureiro. 2010. Surprising patterns for the call duration distribution of mobile phone users. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD’10). 354--369.
[40]
Jeffrey Dean and Sanjay Ghemawat. 2004. MapReduce: Simplified data processing on large clusters. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI’04). 10.
[41]
Pravallika Devineni, Danai Koutra, Michalis Faloutsos, and Christos Faloutsos. 2015. If walls could talk: Patterns and anomalies in Facebook wallposts. In Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM’15). 367--374.
[42]
Inderjit Dhillon, Yuqiang Guan, and Brian Kulis. 2005. A fast kernel-based multilevel algorithm for graph clustering. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’05). ACM, 629--634.
[43]
Laxman Dhulipala, Igor Kabiljo, Brian Karrer, Giuseppe Ottaviano, Sergey Pupyrev, and Alon Shalita. 2016. Compressing graphs and indexes with recursive graph bisection. arXiv:1602.08820.
[44]
P. Drineas, R. Kannan, and M. W. Mahoney. 2006. Fast Monte Carlo algorithms for matrices III: Computing a compressed approximate matrix decomposition. SIAM J. Comput. 36, 1 (2006), 184--206.
[45]
Cody Dunne and Ben Shneiderman. 2013. Motif simplification: Improving network visualization readability with fan, connector, and clique glyphs. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI’13). ACM, 3247--3256.
[46]
P. Elias. 2006. Universal codeword sets and representations of the integers. IEEE Trans. Inf. Theor. 21, 2 (2006), 194--203.
[47]
Wenfei Fan, Jianzhong Li, Xin Wang, and Yinghui Wu. 2012. Query preserving graph compression. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. ACM, 157--168.
[48]
Wenfei Fan, Xin Wang, and Yinghui Wu. 2013. Diversified top-k graph pattern matching. Proc. VLDB Endow. 6, 13 (2013), 1510--1521.
[49]
Jing Feng, Xiao He, Nina Hubig, Christian Böhm, and Claudia Plant. 2013. Compression-based graph mining exploiting structure primitives. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’13). IEEE, 181--190.
[50]
Jure Ferlez, Christos Faloutsos, Jure Leskovec, Dunja Mladenic, and Marko Grobelnik. 2008. Monitoring network evolution using MDL. In Proceedings of the 24th International Conference on Data Engineering (ICDE’08). 1328--1330.
[51]
Wenjie Fu, Le Song, and Eric P. Xing. 2009. Dynamic mixed membership blockmodel for evolving networks. In Proceedings of the 26th Annual International Conference on Machine Learning (ICML’09). ACM, 329--336.
[52]
Johannes Gehrke, Edward Lui, and Rafael Pass. 2003. Towards privacy for social networks: A zero-knowledge based definition of privacy. In Proceedings of the 8th Conference on Theory of Cryptography. 432--449.
[53]
Mina Ghashami, Edo Liberty, and Jeff M. Phillips. 2016. Efficient frequent directions algorithm for sparse matrices. arXiv:1602.00412.
[54]
Sean Gilpin, Tina Eliassi-Rad, and Ian Davidson. 2013. Guided learning for role discovery (gLRD): Framework, algorithms, and applications. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’13). ACM, 113--121.
[55]
Oshini Goonetilleke, Danai Koutra, Timos Sellis, and Kewen Liao. 2017. Edge labeling schemes for graph data. In Proceedings of the Scientific and Statistical Database Management Conference (SSDBM’17). ACM, Article 12, 12 pages.
[56]
Robert Görke, Pascal Maillard, Christian Staudt, and Dorothea Wagner. 2010. Modularity-driven clustering of dynamic graphs. In Proceedings of the 9th International Symposium on Experimental Algorithms (SEA’10). 436--448.
[57]
Szymon Grabowski and Wojciech Bieniecki. 2014. Tight and simple web graph compression for forward and reverse neighbor queries. Discr. Appl. Math. 163 (2014), 298--306.
[58]
Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable feature learning for networks. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’16). ACM.
[59]
Carlos Guestrin, Daphne Koller, Chris Gearhart, and Neal Kanodia. 2003. Generalizing plans to new environments in relational MDPs. In Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI’03).
[60]
Mohammad Al Hasan, Nesreen K. Ahmed, and Jennifer Neville. 2013. Network Sampling: Methods and Applications. Retrieved from https://www.cs.purdue.edu/homes/neville/courses/NetworkSampling-KDD13-final.pdf.
[61]
Nasrin Hassanlou, Maryam Shoaran, and Alex Thomo. 2013. Probabilistic graph summarization. In Web-Age Information Management. Springer, 545--556.
[62]
Mark Heimann, Haoming Shen, and Danai Koutra. 2018. Node representation learning for multiple networks: The case of graph alignment. arXiv:1802.06257.
[63]
Keith Henderson, Brian Gallagher, Tina Eliassi-Rad, Hanghang Tong, Sugato Basu, Leman Akoglu, Danai Koutra, Christos Faloutsos, and Lei Li. 2012. RolX: Structural role extraction 8 mining in large graphs. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’12). ACM, 1231--1239.
[64]
Keith Henderson, Brian Gallagher, Lei Li, Leman Akoglu, Tina Eliassi-Rad, Hanghang Tong, and Christos Faloutsos. 2011. It’s who you know: Graph mining using recursive structural features. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’11). ACM, 663--671.
[65]
Wilko Henecka and Matthew Roughan. 2015. Lossy compression of dynamic, weighted graphs. In Proceedings of the 2015 3rd International Conference on Future Internet of Things and Cloud (FiCloud’15). 427--434.
[66]
Shawndra Hill, Deepak Agarwal, Robert Bell, and Chris Volinsky. 2006. Building an effective representation for dynamic networks. J. Comput. Graph. Stat. 15 (2006), 1--25.
[67]
Christian Hübler, Hans-Peter Kriegel, Karsten Borgwardt, and Zoubin Ghahramani. 2008. Metropolis algorithms for representative subgraph sampling. In Proceedings of the 2008 8th IEEE International Conference on Data Mining (ICDM’08). IEEE, 283--292.
[68]
Di Jin and Danai Koutra. 2017a. Exploratory analysis of graph data by leveraging domain knowledge. In Proceedings of the 2017 IEEE International Conference on Data Mining. 187--196.
[69]
Di Jin, Aristotelis Leventidis, Haoming Shen, Ruowang Zhang, Junyue Wu, and Danai Koutra. 2017. PERSEUS-HUB: Interactive and collective exploration of large-scale graphs. Informatics 4, 3 (2017), 22.
[70]
Lisa Jin and Danai Koutra. 2017b. ECOviz: Comparative visualization of time-evolving network summaries. In Proceedings of the ACM SIGKDD 2017 Workshop on Interactive Data Exploration and Analytics.
[71]
U. Kang, Jay-Yoon Lee, Danai Koutra, and Christos Faloutsos. 2014. Net-ray: Visualizing and mining web-scale graphs. In Proceedings of the 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’14).
[72]
U. Kang, Hanghang Tong, Jimeng Sun, Ching-Yung Lin, and Christos Faloutsos. 2011. Gbase: A scalable and general graph management system. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’11). ACM, 1091--1099.
[73]
U. Kang, Charalampos E. Tsourakakis, and Christos Faloutsos. 2009. PEGASUS: A peta-scale graph mining system—implementation and observations. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’09).
[74]
George Karypis and Vipin Kumar. 1999. Multilevel k-way hypergraph partitioning. In Proceedings of the 36th Annual ACM/IEEE Design Automation Conference. 343--348.
[75]
David Kempe, Jon Kleinberg, and Éva Tardos. 2003. Maximizing the spread of influence through a social network. In Proceedings of the Conference of the ACM Special Interest Group on Knowledge Discovery and Data Mining. ACM.
[76]
Kifayat Ullah Khan, Waqas Nawaz, and Young-Koo Lee. 2014. Set-based unified approach for attributed graph summarization. In Proceedings of the IEEE 4th International Conference on Big Data and Cloud Computing (BdCloud’14). IEEE, 378--385.
[77]
Mikko Kivel, Alex Arenas, Marc Barthelemy, James P. Gleeson, Yamir Moreno, and Mason A. Porter. 2014. Multilayer networks. J. Complex Netw. 2, 3 (2014), 203--271.
[78]
Arne Koopman and Arno Siebes. 2008. Discovering relational items sets efficiently. In Proceedings of the SIAM International Conference on Data Mining (SDM’08). 108--119.
[79]
Arne Koopman and Arno Siebes. 2009. Characteristic relational patterns. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’09). 437--446.
[80]
Danai Koutra, Abhilash Dighe, Smriti Bhagat, Udi Weinsberg, Stratis Ioannidis, Christos Faloutsos, and Jean Bolot. 2017. PNP: Fast path ensemble method for movie design. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’17).
[81]
Danai Koutra and Christos Faloutsos. 2017. Individual and Collective Graph Mining: Principles, Algorithms, and Applications. Synthesis Lectures on Data Mining and Knowledge Discovery. Morgan 8 Claypool.
[82]
Danai Koutra, Di Jin, Yuanchi Ning, and Christos Faloutsos. 2015. Perseus: An interactive large-scale graph mining and visualization tool. Proc. VLDB Endow. 8, 12, 1924--1927.
[83]
Danai Koutra, U. Kang, Jilles Vreeken, and Christos Faloutsos. 2014b. VoG: Summarizing and understanding large graphs. In Proceedings of the SIAM international Conference on Data Mining (SDM’14). 91--99.
[84]
Danai Koutra, U. Kang, Jilles Vreeken, and Christos Faloutsos. 2014a. VoG: Summarizing and understanding large graphs. In Proceedings of the SIAM international Conference on Data Mining (SDM’14). SIAM.
[85]
Danai Koutra, Vasileios Koutras, B. Aditya Prakash, and Christos Faloutsos. 2013. Patterns amongst competing task frequencies: Super-linearities, and the almond-DG model. In Proceedings of the 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’13). 201--212.
[86]
Danai Koutra, Evangelos E. Papalexakis, and Christos Faloutsos. 2012. TensorSplat: Spotting latent anomalies in time. In Proceedings of the 2012 16th Panhellenic Conference on Informatics (PCI’12). IEEE, 144--149.
[87]
Danai Koutra, Neil Shah, Joshua Vogelstein, Brian Gallagher, and Christos Faloutsos. 2015. DeltaCon: A principled massive-graph similarity function with attribution. ACM Trans. Knowl. Discov. Data 10, 3, Article 28.
[88]
Danai Koutra, Joshua Vogelstein, and Christos Faloutsos. 2013. DeltaCon: A principled massive-graph similarity function. In Proceedings of the SIAM International Conference on Data Mining (SDM’13). 162--170.
[89]
Luigi Laura, Stefano Leonardi, Guido Caldarelli, and Paolo De Los Rios. 2002. A multi-layer model for the web graph. In On-Line Proceedings of the 2nd International Workshop on Web Dynamics.
[90]
Matthijs Leeuwen van Leeuwen, Jilles Vreeken, and Arno Siebes. 2006. Compression picks the item sets that matter. In Proceedings of the Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD’06). 585--592.
[91]
Kristen LeFevre and Evimaria Terzi. 2010. GraSS: Graph structure summarization. In Proceedings of the SIAM International Conference on Data Mining (SDM’10). 454--465.
[92]
Jure Leskovec, Lars Backstrom, Ravi Kumar, and Andrew Tomkins. 2008. Microscopic evolution of social networks. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’08). 462--470.
[93]
Jure Leskovec and Christos Faloutsos. 2007. Scalable modeling of real graphs using Kronecker multiplication. In Proceedings of the 24th International Conference on Machine Learning (ICML’07). 497--504.
[94]
Jure Leskovec, Jon M. Kleinberg, and Christos Faloutsos. 2005. Graphs over time: Densification laws, shrinking diameters and possible explanations. In Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 177--187.
[95]
Jure Leskovec, Anand Rajaraman, and Jeffrey David Ullman. 2014. Mining of Massive Datasets. Cambridge University Press.
[96]
Cheng-Te Li and Shou-De Lin. 2009. Egocentric information abstraction for heterogeneous social networks. In Proceedings of the International Conference on Advances in Social Network Analysis and Mining (ASONAM’09). IEEE, 255--260.
[97]
Edo Liberty. 2013. Simple and deterministic matrix sketching. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’13). ACM, 581--588.
[98]
Yongsub Lim, U. Kang, and Christos Faloutsos. 2014. SlashBurn: Graph compression and mining beyond caveman communities. IEEE Trans. Knowl. Data Eng. 26, 12 (2014), 3077--3089.
[99]
Shou-De Lin, Mi-Yen Yeh, and Cheng-Te Li. 2013. Sampling and summarization for social networks. In Proceedings of the 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’13).
[100]
Xuemin Lin, Qing Liu, Yidong Yuan, and Xiaofang Zhou. 2003. Multiscale histograms: Summarizing topological relations in large spatial datasets. In Proceedings of the International Conference on Very Large Databases (VLDB’03). 814--825.
[101]
Yu-Ru Lin, Hari Sundaram, and Aisling Kelliher. 2008. Summarization of social activity over time: People, actions and concepts in dynamic networks. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM’08). 1379--1380.
[102]
Bing Liu, Wynne Hsu, and Yiming Ma. 1999. Pruning and summarizing the discovered associations. In Proceedings of the 5th ACM SIGKDD International Conference Knowledge Discovery and Data Mining (KDD’99). 145--154.
[103]
Chunyang Liu and Ling Chen. 2016. Summarizing uncertain transaction databases by probabilistic tiles. In Proceedings of the International Joint Conference on Neural Networks (IJCNN’16). IEEE, 4375--4382.
[104]
Wei Liu, Andrey Kan, Jeffrey Chan, James Bailey, Christopher Leckie, Jian Pei, and Ramamohanarao Kotagiri. 2012. On compressing weighted time-evolving graphs. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM’12). ACM, 2319--2322.
[105]
Yike Liu, Neil Shah, and Danai Koutra. 2015. An empirical comparison of the summarization power of graph clustering methods. arXiv:1511.06820.
[106]
Yucheng Low, Danny Bickson, Joseph Gonzalez, Carlos Guestrin, Aapo Kyrola, and Joseph M. Hellerstein. 2012. Distributed graphlab: A framework for machine learning and data mining in the cloud. Proc. VLDB Endow. 5, 8 (2012), 716--727.
[107]
Antonio Maccioni and Daniel J. Abadi. 2016. Scalable pattern matching over compressed graphs via dedensification. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’16). ACM, 1755--1764.
[108]
Arun S. Maiya and Tanya Y. Berger-Wolf. 2010. Sampling community structure. In Proceedings of the 25th International Conference Conference on the World Wide Web (WWW’10). ACM, 701--710.
[109]
M. Mampaey, J. Vreeken, and N. Tatti. 2011. Summarizing Data with Itemsets Using Maximum Entropy Models. Technical Report 2011/02. University of Antwerp.
[110]
Sebastian Maneth and Fabian Peternek. 2016. Compressing graphs by grammars. In Proceedings of the 2016 IEEE 32nd International Conference on Data Engineering (ICDE’16). IEEE, 109--120.
[111]
Koji Maruhashi, Fan Guo, and Christos Faloutsos. 2011. Multiaspectforensics: Pattern mining on large-scale heterogeneous networks with tensor analysis. In Proceedings of the 2011 International Conference on Advances in Social Networks Analysis and Mining. 203--210.
[112]
Hossein Maserrat and Jian Pei. 2010. Neighbor query friendly compression of social networks. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’10).
[113]
Hossein Maserrat and Jian Pei. 2012. Community preserving lossy compression of social networks. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’12). IEEE, 509--518.
[114]
Michael Mathioudakis, Francesco Bonchi, Carlos Castillo, Aristides Gionis, and Antti Ukkonen. 2011. Sparsification of influence networks. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’11). 529--537.
[115]
Yasir Mehmood, Nicola Barbieri, Francesco Bonchi, and Antti Ukkonen. 2013. Csi: Community-level social influence analysis. In Machine Learning and Knowledge Discovery in Databases. Springer, 48--63.
[116]
Pauli Miettinen and Jilles Vreeken. 2011. Model order selection for boolean matrix factorization. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’11). 51--59.
[117]
Pauli Miettinen and Jilles Vreeken. 2014. mdl4bmf: Minimum description length for Boolean matrix factorization. ACM Trans. Knowl. Discov. Data 8, 4 (2014), 1--30.
[118]
Saket Navlakha, Rajeev Rastogi, and Nisheeth Shrivastava. 2008. Graph summarization with bounded error. In Proceedings of the ACM Special Interest Group on Management of Data (SIGMOD’08). 419--432.
[119]
Mark E. J. Newman and Michelle Girvan. 2004. Finding and evaluating community structure in networks. Phys. Rev. E 69, 2 (2004), 026113+.
[120]
Carlos Ordonez, Norberto Ezquerra, and Cesar A. Santana. 2006. Constraining and summarizing association rules in medical data. Knowl. Inf. Syst. 9, 3 (2006), 259--283.
[121]
Themis Palpanas, Michail Vlachos, Eamonn J. Keogh, and Dimitrios Gunopulos. 2008. Streaming time series summarization using user-defined amnesic functions. IEEE Trans. Knowl. Data Eng. 20, 7 (2008), 992--1006.
[122]
Jia-Yu Pan, Hyung-Jeong Yang, and Christos Faloutsos. 2004. MMSS: Multi-modal story-oriented video summarization. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’04). 491--494.
[123]
Jian Pei, Daxin Jiang, and Aidong Zhang. 2005. On mining cross-graph quasi-cliques. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’05). 228--238.
[124]
David Peleg and Alejandro A. Schäffer. 1989. Graph spanners. J. Graph Theory 13, 1 (1989), 99--116.
[125]
Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. DeepWalk: Online learning of social representations. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’14). ACM, 701--710.
[126]
Robert Pienta, Acar Tamersoy, Hanghang Tong, and Duen Horng Chau. 2014. MAGE: Matching approximate patterns in richly-attributed graphs. In Proceedings of the 2014 IEEE International Conference on Big Data. 585--590.
[127]
B. Aditya Prakash, Jilles Vreeken, and Christos Faloutsos. 2012. Spotting culprits in epidemics: How many and which ones? In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’12). IEEE.
[128]
M. Purohit, B. A. Prakash, C. Kang, Y. Zhang, and V. S. Subrahmanian. 2014. Fast influence-based coarsening for large networks. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’14). ACM, 1296--1305.
[129]
Qiang Qu, Siyuan Liu, Christian S. Jensen, Feida Zhu, and Christos Faloutsos. 2014. Interestingness-driven diffusion process summarization in dynamic networks. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD’14). 597--613.
[130]
Davood Rafiei and Stephen Curial. 2005. Effectively visualizing large networks through sampling. In Proceedings of the 16th IEEE Visualization Conference (VIS’05). 48.
[131]
Sriram Raghavan and Hector Garcia-Molina. 2003. Representing web graphs. In Proceedings of the 2003 IEEE International Conference on Data Engineering (ICDE’03). IEEE, 405--416.
[132]
Xuguang Ren and Junhu Wang. 2015. Exploiting vertex relationships in speeding up subgraph isomorphism over large graphs. Proc. VLDB Endow. 8, 5 (2015), 617--628.
[133]
Leonardo F. R. Ribeiro, Pedro H. P. Saverese, and Daniel R. Figueiredo. 2017. struc2vec: Learning node representations from structural identity. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’17). ACM, 385--394.
[134]
Matteo Riondato, David García-Soriano, and Francesco Bonchi. 2014. Graph summarization with quality guarantees. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’14). IEEE, 947--952.
[135]
Ryan Rossi, Brian Gallagher, Jennifer Neville, and Keith Henderson. 2012. Role-dynamics: Fast mining of large dynamic networks. InProceedings of the 25th International Conference Companion on the World Wide Web (WWW’12 Companion). ACM, 997--1006.
[136]
T. Safavi, C. Sripada, and D. Koutra. 2017. Scalable hashing-based network discovery. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’17). 405--414.
[137]
Barna Saha and Pabitra Mitra. 2007. Dynamic algorithm for graph clustering using minimum cut tree. In Proceedings of the SIAM International Conference on Data Mining (SDM’07). 581--586.
[138]
Neil Shah, Danai Koutra, Lisa Jin, Tianmin Zou, Brian Gallagher, and Christos Faloutsos. 2017. On summarizing large-scale dynamic graphs. IEEE Data Eng. Bull. 40, 3 (2017), 75--88.
[139]
Neil Shah, Danai Koutra, Tianmin Zou, Brian Gallagher, and Christos Faloutsos. 2015. TimeCrunch: Interpretable dynamic graph summarization. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’15).
[140]
P. Shannon, A. Markiel, O. Ozier, N. S. Baliga, J. T. Wang, D. Ramage, N. Amin, B. Schwikowski, and T. Ideker. 2003. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 11 (2003), 2498.
[141]
Umang Sharan and Jennifer Neville. 2008. Temporal-relational classifiers for prediction in evolving domains. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’08). 540--549.
[142]
Z. Shen, K.-L. Ma, and T. Eliassi-Rad. 2006. Visual analysis of large heterogeneous social networks by semantic and structural abstraction. IEEE Trans. Vis. Comput. Graph. 12, 6 (2006), 1427--1439.
[143]
Lei Shi, Hanghang Tong, Jie Tang, and Chuang Lin. 2015. VEGAS: Visual influence graph summarization on citation networks. IEEE Trans. Knowl. Data Eng. 27, 12 (2015), 3417--3431.
[144]
Ben Shneiderman. 2008. Extreme visualization: Squeezing a billion records into a million pixels. In Proceedings of the ACM Special Interest Group on Management of Data (SIGMOD’08).
[145]
Mahsa Shoaran, Alex Thomo, and Jens H. Weber-Jahnke. 2013. Zero-knowledge private graph summarization. In Proceedings of the IEEE International Conference on Big Data. IEEE, 597--605.
[146]
Koen Smets and Jilles Vreeken. 2011. The odd one out: Identifying and characterising anomalies. In Proceedings of the SIAM International Conference on Data Mining (SDM’11). 804--815.
[147]
Qi Song, Yinhui Wu, and Xin Luna Dong. 2016. Mining summaries for knowledge graph search. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’16). 1215--1220.
[148]
Sucheta Soundarajan, Acar Tamersoy, Elias B. Khalil, Tina Eliassi-Rad, Duen Horng Chau, Brian Gallagher, and Kevin Roundy. 2016. Generating graph snapshots from streaming edge data. In Proceedings of the 25th International Conference Companion on the World Wide Web. 109--110.
[149]
Daniel A. Spielman and Nikhil Srivastava. 2011. Graph sparsification by effective resistances. SIAM J. Comput. 40, 6 (2011), 1913--1926.
[150]
Olaf Sporns. 2010. Networks of the Brain. MIT Press, Cambridge, MA.
[151]
Jimeng Sun, Christos Faloutsos, Spiros Papadimitriou, and Philip S. Yu. 2007. GraphScope: Parameter-free mining of large time-evolving graphs. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’07). ACM, 687--696.
[152]
Jimeng Sun, Charalampos E. Tsourakakis, Evan Hoke, Christos Faloutsos, and Tina Eliassi-Rad. 2008. Two heads better than one: Pattern discovery in time-evolving multi-aspect data. Data Min. Knowl. Discov. 17, 1 (2008), 111--128.
[153]
Jimeng Sun, Yinglian Xie, Hui Zhang, and Christos Faloutsos. 2007. Less is more: Compact matrix decomposition for large sparse graphs. In Proceedings of the SIAM International Conference on Data Mining (SDM’07).
[154]
Yizhou Sun and Jiawei Han. 2012. Mining Heterogeneous Information Networks: Principles and Methodologies. Morgan 8 Claypool.
[155]
Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, and Qiaozhu Mei. 2015. LINE: Large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web (WWW’15). 1067--1077.
[156]
Nan Tang, Qing Chen, and Prasenjit Mitra. 2016. Graph stream summarization: From big bang to big crunch. In Proceedings of the 2016 International Conference on Management of Data. ACM, 1481--1496.
[157]
Chayant Tantipathananandh and Tanya Berger-Wolf. 2011. Finding communities in dynamic social networks. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’11). IEEE, 1236--1241.
[158]
Lini T. Thomas, Satyanarayana R. Valluri, and Kamalakar Karlapalem. 2010. MARGIN: Maximal frequent subgraph mining. ACM Trans. Knowl. Discov. Data 4, 3 (2010), 10:1--10:42.
[159]
Yuanyuan Tian, Richard A. Hankins, and Jignesh M. Patel. 2008. Efficient aggregation for graph summarization. In Proceedings of the ACM Special Interest Group on Management of Data (SIGMOD’08). ACM, 567--580.
[160]
Yuanyuan Tian and Jignesh M. Patel. 2008. TALE: A tool for approximate large graph matching. In Proceedings of the 2008 IEEE 24th International Conference on Data Engineering. 963--972.
[161]
Hannu Toivonen, Fang Zhou, Aleksi Hartikainen, and Atte Hinkka. 2011. Compression of weighted graphs. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’11). 965--973.
[162]
Hanghang Tong, Christos Faloutsos, Brian Gallagher, and Tina Eliassi-Rad. 2007. Fast best-effort pattern matching in large attributed graphs. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 737--746.
[163]
Jilles Vreeken, Matthijs van Leeuwen, and Arno Siebes. 2011. Krimp: Mining itemsets that compress. Data Min. Knowl. Disc. 23, 1 (2011), 169--214.
[164]
Bianca Wackersreuther, Peter Wackersreuther, Annahita Oswald, Christian Böhm, and Karsten M. Borgwardt. 2010. Frequent subgraph discovery in dynamic networks. In Proceedings of the 8th Workshop on Mining and Learning with Graphs. ACM, 155--162.
[165]
Daixin Wang, Peng Cui, and Wenwu Zhu. 2016. Structural deep network embedding. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’16).
[166]
Jianyong Wang and George Karypis. 2004. SUMMARY: Efficiently summarizing transactions for clustering. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’04). 241--248.
[167]
S. Wasserman and J. Galaskiewicz. 1994. Advances in Social Network Analysis: Research in the Social and Behavioral Sciences. SAGE Publications.
[168]
Ye Wu, Zhinong Zhong, Wei Xiong, and Ning Jing. 2014. Graph summarization for attributed graphs. In Proceedings of the International Conference on Information Science, Electronics, and Electrical Engineering (ISEEE’14). IEEE, 503--507.
[169]
Yang Xiang, Ruoming Jin, David Fuhry, and Feodor Dragan. 2010. Summarizing transactional databases with overlapped hyperrectangles. Data Min. Knowl. Disc. 23, 2, 215--251.
[170]
Kevin S. Xu, Mark Kliger, and Alfred O. Hero III. 2011. Tracking communities in dynamic social networks. In Proceedings of the 4th International Conference on Social Computing, Behavioral-Cultural Modeling, 8 Prediction (SBP’11). 219--226.
[171]
Zhiqiang Xu, Yiping Ke, Yi Wang, Hong Cheng, and James Cheng. 2012. A model-based approach to attributed graph clustering. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD’12). ACM, 505--516.
[172]
Xifeng Yan, Hong Cheng, Jiawei Han, and Dong Xin. 2005. Summarizing itemset patterns: A profile-based approach. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’05). 314--323.
[173]
Jaewon Yang and Jure Leskovec. 2013. Overlapping community detection at scale: A nonnegative matrix factorization approach. In Proceedings of the 6th ACM International Conference on Web Search and Data Mining (WSDM’13). ACM, 587--596.
[174]
Jaewon Yang, Julian McAuley, and Jure Leskovec. 2013. Community detection in networks with node attributes. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’13). IEEE, 1151--1156.
[175]
Liu Yang, Susan T. Dumais, Paul N. Bennett, and Ahmed Hassan Awadallah. 2017. Characterizing and predicting enterprise email reply behavior. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’17). ACM, 235--244.
[176]
Jinguo You, Qiuping Pan, Wei Shi, Zhipeng Zhang, and Jianhua Hu. 2013. Towards graph summary and aggregation: A survey. In Social Media Retrieval and Mining. Springer, 3--12.
[177]
Ning Zhang, Yuanyuan Tian, and Jignesh M. Patel. 2010. Discovery-driven graph summarization. In Proceedings of the 2003 IEEE International Conference on Data Engineering (ICDE’10). 880--891.
[178]
Peixiang Zhao, Charu C. Aggarwal, and Min Wang. 2011. gSketch: On query estimation in graph streams. Proc. VLDB Endow. 5, 3 (2011), 193--204.
[179]
Yang Zhou, Hong Cheng, and Jeffrey Xu Yu. 2009. Graph clustering based on structural/attribute similarities. Proc. VLDB Endow. 2, 1 (2009), 718--729.
[180]
Linhong Zhu, Majid Ghasemi-Gol, Pedro Szekely, Aram Galstyan, and Craig A. Knoblock. 2016. Unsupervised entity resolution on multi-type graphs. In Proceedings of the International Semantic Web Conference. 649--667.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Computing Surveys
ACM Computing Surveys  Volume 51, Issue 3
May 2019
796 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/3212709
  • Editor:
  • Sartaj Sahni
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 June 2018
Accepted: 01 February 2018
Revised: 01 January 2018
Received: 01 December 2016
Published in CSUR Volume 51, Issue 3

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Graph mining
  2. graph summarization

Qualifiers

  • Survey
  • Research
  • Refereed

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1,591
  • Downloads (Last 6 weeks)143
Reflects downloads up to 14 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2025)Vessel Trajectory Data Mining: A ReviewIEEE Access10.1109/ACCESS.2025.352595213(4827-4856)Online publication date: 2025
  • (2025)MARA: A deep learning based framework for multilayer graph simplificationNeurocomputing10.1016/j.neucom.2024.128712612(128712)Online publication date: Jan-2025
  • (2025)Finding meaningful paths in heterogeneous graphs with PathWaysInformation Systems10.1016/j.is.2024.102463127:COnline publication date: 1-Jan-2025
  • (2025)Mining structure overlaps for efficient graph compressionInternational Journal of Data Science and Analytics10.1007/s41060-024-00711-wOnline publication date: 4-Jan-2025
  • (2025)Analyzing the Impact of Coarsening on k-Partite Network ClassificationIntelligent Systems10.1007/978-3-031-79029-4_11(156-168)Online publication date: 30-Jan-2025
  • (2024)A survey on extractive knowledge graph summarizationProceedings of the Thirty-Third International Joint Conference on Artificial Intelligence10.24963/ijcai.2024/916(8290-8298)Online publication date: 3-Aug-2024
  • (2024)Compression-based inference of network motif setsPLOS Computational Biology10.1371/journal.pcbi.101246020:10(e1012460)Online publication date: 10-Oct-2024
  • (2024)Simplifying Distributed Neural Network Training on Massive Graphs: Randomized Partitions Improve Model AggregationACM Transactions on Knowledge Discovery from Data10.1145/370156319:1(1-26)Online publication date: 28-Dec-2024
  • (2024)Accurate Sampling-Based Cardinality Estimation for Complex Graph QueriesACM Transactions on Database Systems10.1145/3689209Online publication date: 17-Aug-2024
  • (2024)Graph Summarization: Compactness Meets EfficiencyProceedings of the ACM on Management of Data10.1145/36549432:3(1-26)Online publication date: 30-May-2024
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media