short-paper

Open access

HyperFormer: Learning Expressive Sparse Feature Representations via Hypergraph Transformer

Authors:

Albert Jiongqian Liang,

Derek Zhiyuan Cheng

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 2062 - 2066

https://doi.org/10.1145/3539618.3591999

Published: 18 July 2023

Abstract

Learning expressive representations for high-dimensional yet sparse features has been a longstanding problem in information retrieval. Though recent deep learning methods can partially solve the problem, they often fail to handle the numerous sparse features, particularly those tail feature values with infrequent occurrences in the training data. Worse still, existing methods cannot explicitly leverage the correlations among different instances to help further improve the representation learning on sparse features since such relational prior knowledge is not provided. To address these challenges, in this paper, we tackle the problem of representation learning on feature-sparse data from a graph learning perspective. Specifically, we propose to model the sparse features of different instances using hypergraphs where each node represents a data instance and each hyperedge denotes a distinct feature value. By passing messages on the constructed hypergraphs based on our Hypergraph Transformer (HyperFormer), the learned feature representations capture not only the correlations among different instances but also the correlations among features. Our experiments demonstrate that the proposed approach can effectively improve feature representation learning on sparse features.
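The hypergraph construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' HyperFormer implementation: the toy instances, field names, and the functions `build_incidence` and `propagate` are hypothetical, and plain mean aggregation stands in for the Transformer-style attention the paper uses. The sketch only shows how instances become nodes, how distinct feature values become hyperedges, and how one round of message passing lets a rarely seen feature value borrow information from features that co-occur with it.

```python
import numpy as np

# Hypothetical toy data: each dict is one data instance, each entry one sparse
# categorical feature value (field names and values are made up).
instances = [
    {"user_country": "US", "item_category": "books"},
    {"user_country": "US", "item_category": "games"},
    {"user_country": "FR", "item_category": "books"},
    {"user_country": "US", "item_category": "rare_tail_value"},
]


def build_incidence(instances):
    """Return (H, vocab) where H[i, j] = 1 if instance i carries feature value j.

    Rows are nodes (instances); columns are hyperedges (distinct feature values).
    """
    vocab = {}
    for inst in instances:
        for field, value in inst.items():
            vocab.setdefault((field, value), len(vocab))
    H = np.zeros((len(instances), len(vocab)))
    for i, inst in enumerate(instances):
        for field, value in inst.items():
            H[i, vocab[(field, value)]] = 1.0
    return H, vocab


def propagate(H, edge_emb):
    """One hyperedge -> node -> hyperedge round using mean aggregation.

    Assumption: mean pooling replaces the attention used in the paper.
    """
    node_deg = H.sum(axis=1, keepdims=True)     # feature values per instance
    edge_deg = H.sum(axis=0, keepdims=True).T   # instances per feature value
    node_emb = (H @ edge_emb) / node_deg        # instance = mean of its feature values
    new_edge_emb = (H.T @ node_emb) / edge_deg  # feature value = mean of its instances
    return node_emb, new_edge_emb


H, vocab = build_incidence(instances)
rng = np.random.default_rng(0)
edge_emb = rng.normal(size=(len(vocab), 8))  # initial feature-value embeddings
node_emb, edge_emb_out = propagate(H, edge_emb)
print(H.shape, node_emb.shape, edge_emb_out.shape)
```

In this toy example the hyperedge for `rare_tail_value` connects to a single instance, so its updated embedding equals that instance's embedding, which already mixes in the frequent `US` feature value. That is the intuition behind using instance correlations to help tail features.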

Cited By

Shen J, Qian H, Liu S, Zhang W, Jiang B, Zhou A, Baeza-Yates R, Bonchi F. (2024). Capturing Homogeneous Influence among Students: Hypergraph Cognitive Diagnosis for Intelligent Education Systems. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 2628-2639. Online publication date: 25-Aug-2024. https://dl.acm.org/doi/10.1145/3637528.3672002

Huang C, Yu F, Wan Z, Li F, Ji H, Li Y. (2024). Knowledge graph confidence-aware embedding for recommendation. Neural Networks, article 106601. Online publication date: Aug-2024. https://doi.org/10.1016/j.neunet.2024.106601

Wu J, Yu T, Wang R, Song Z, Zhang R, Zhao H, Lu C, Li S, Henao R, Oh A, Naumann T, Globerson A, Saenko K, Hardt M, Levine S. (2023). InfoPrompt. Proceedings of the 37th International Conference on Neural Information Processing Systems, pp. 61060-61084. Online publication date: 10-Dec-2023. https://dl.acm.org/doi/10.5555/3666122.3668791

Index Terms

HyperFormer: Learning Expressive Sparse Feature Representations via Hypergraph Transformer
1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Neural networks
2. Information systems
  1. Information retrieval

Information & Contributors

Information

Published In

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2023

3567 pages

ISBN:9781450394086

DOI:10.1145/3539618

General Chairs: Hsin-Hsi Chen (National Taiwan University), Wei-Jou (Edward) Duh (National Taiwan University), Hen-Hsen Huang (Academia Sinica)

Program Chairs: Makoto P. Kato (Spotify), Josiane Mothe (Universite de Toulouse), Barbara Poblete (University of Chile and Amazon Visiting Academic)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Funding Sources

NSF (National Science Foundation)

Conference

SIGIR '23

Sponsor:

SIGIR

SIGIR '23: The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 23 - 27, 2023

Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics & Citations

Article Metrics

Total Citations: 4
Total Downloads: 567
Downloads (Last 12 months): 478
Downloads (Last 6 weeks): 72

Reflects downloads up to 19 Sep 2024

Citations

Cited By

Cheng D, Wang R, Kang W, Coleman B, Zhang Y, Ni J, Valverde J, Hong L, Chi E. (2023). Efficient Data Representation Learning in Google-scale Systems. Proceedings of the 17th ACM Conference on Recommender Systems, pp. 267-271. Online publication date: 14-Sep-2023. https://dl.acm.org/doi/10.1145/3604915.3608882
