Computer Science > Social and Information Networks

arXiv:2004.00216 (cs)

[Submitted on 1 Apr 2020 (v1), last revised 17 Dec 2020 (this version, v3)]

Title:Heterogeneous Network Representation Learning: A Unified Framework with Survey and Benchmark

Authors:Carl Yang, Yuxin Xiao, Yu Zhang, Yizhou Sun, Jiawei Han

Abstract: Since real-world objects and their interactions are often multi-modal and multi-typed, heterogeneous networks have been widely used as a more powerful, realistic, and generic superclass of traditional homogeneous networks (graphs). Meanwhile, representation learning (a.k.a. embedding) has recently been intensively studied and shown effective for various network mining and analytical tasks. In this work, we aim to provide a unified framework to deeply summarize and evaluate existing research on heterogeneous network embedding (HNE), which includes but goes beyond a normal survey. Since there has already been a broad body of HNE algorithms, as the first contribution of this work, we provide a generic paradigm for the systematic categorization and analysis over the merits of various existing HNE algorithms. Moreover, existing HNE algorithms, though mostly claimed generic, are often evaluated on different datasets. Understandable due to the application favor of HNE, such indirect comparisons largely hinder the proper attribution of improved task performance towards effective data preprocessing and novel technical design, especially considering the various ways possible to construct a heterogeneous network from real-world application data. Therefore, as the second contribution, we create four benchmark datasets with various properties regarding scale, structure, attribute/label availability, etc., from different sources, towards handy and fair evaluations of HNE algorithms. As the third contribution, we carefully refactor and amend the implementations and create friendly interfaces for 13 popular HNE algorithms, and provide all-around comparisons among them over multiple tasks and experimental settings.

Comments:	Accepted by IEEE TKDE. All code and data available at this https URL
Subjects:	Social and Information Networks (cs.SI); Machine Learning (cs.LG)
Cite as:	arXiv:2004.00216 [cs.SI]
	(or arXiv:2004.00216v3 [cs.SI] for this version)
	https://doi.org/10.48550/arXiv.2004.00216

Submission history

From: Carl Yang
[v1] Wed, 1 Apr 2020 03:42:11 UTC (1,891 KB)
[v2] Mon, 15 Jun 2020 19:07:41 UTC (2,051 KB)
[v3] Thu, 17 Dec 2020 01:44:03 UTC (3,664 KB)
