DOI: 10.1145/3477495.3531968
short-paper

Distilling Knowledge on Text Graph for Social Media Attribute Inference

Published: 07 July 2022

Abstract

The popularity of social media generates a large amount of user-oriented data, and the text data in particular attracts researchers and speculators who infer user attributes (e.g., age, gender) for their own purposes. This line of work generally casts attribute inference as a text classification problem and has begun to leverage graph neural networks for higher-level text representations. However, these text graphs are constructed on words, so they suffer from high memory consumption and are ineffective when few texts are labeled. To address this challenge, we design a text-graph-based few-shot learning model for social media attribute inference. Our model builds a text graph with texts as nodes and with edges learned from the current text representations via manifold learning and message passing. To further exploit unlabeled texts and improve few-shot performance, we devise a knowledge distillation step to optimize the model. This offers a trade-off between expressiveness and complexity. Experiments on social media datasets demonstrate that our model achieves state-of-the-art performance on attribute inference with considerably fewer labeled texts.
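
The ingredients named in the abstract (a graph whose nodes are texts, message passing over that graph, and knowledge distillation) can be illustrated with a small pure-Python sketch. This is not the paper's implementation: the Gaussian-kernel edges are an assumed stand-in for the learned edges, and the function names `build_text_graph`, `propagate`, and `distillation_loss` are hypothetical. The distillation term follows the common temperature-softened KL form, without any extra scaling factor.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def build_text_graph(embeddings, sigma=1.0):
    """Row-normalised adjacency over texts from a Gaussian kernel.

    Assumption: a fixed kernel on pairwise distances replaces the
    learned edges described in the abstract. Self-edges are kept.
    """
    adj = []
    for u in embeddings:
        row = []
        for v in embeddings:
            d2 = sum((a - b) ** 2 for a, b in zip(u, v))
            row.append(math.exp(-d2 / (2 * sigma ** 2)))
        s = sum(row)
        adj.append([w / s for w in row])
    return adj


def propagate(adj, features):
    """One round of message passing: a weighted mean over all nodes."""
    n, dim = len(features), len(features[0])
    return [
        [sum(adj[i][j] * features[j][k] for j in range(n)) for k in range(dim)]
        for i in range(n)
    ]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy data: two nearby "texts" and one distant one (made-up embeddings).
embeddings = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
adj = build_text_graph(embeddings)
smoothed = propagate(adj, embeddings)
loss = distillation_loss([2.0, 0.5], [1.0, 1.0])
```

In this toy setup the two nearby texts exchange most of their weight with each other and almost none with the distant text, and the distillation term is zero only when the student's softened distribution matches the teacher's.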

Supplementary Material

MP4 File (SIGIR2022_sp1432.mp4)
This is the presentation video for the paper "Distilling Knowledge on Text Graph for Social Media Attribute Inference." The video presents our motivation, describes our model in detail, and briefly analyzes our results.



Information & Contributors

Information

Published In

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2022
3569 pages
ISBN: 9781450387323
DOI: 10.1145/3477495
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. attribute inference
  2. few-shot learning
  3. graph neural networks
  4. knowledge distillation
  5. social media

Qualifiers

  • Short-paper

Conference

SIGIR '22
Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 72
  • Downloads (Last 6 weeks): 8
Reflects downloads up to 23 Nov 2024

Citations

Cited By

  • (2024) Fairness Testing of Machine Translation Systems. ACM Transactions on Software Engineering and Methodology 33:6 (1-27). DOI: 10.1145/3664608. Online publication date: 27-Jun-2024
  • (2024) Exploring the Potential of Large Language Models (LLMs) in Learning on Graphs. ACM SIGKDD Explorations Newsletter 25:2 (42-61). DOI: 10.1145/3655103.3655110. Online publication date: 28-Mar-2024
  • (2024) DOS-GNN: Dual-Feature Aggregations with Over-Sampling for Class-Imbalanced Fraud Detection On Graphs. 2024 International Joint Conference on Neural Networks (IJCNN) (1-8). DOI: 10.1109/IJCNN60899.2024.10650494. Online publication date: 30-Jun-2024
  • (2024) Leveraging Homophily-Augmented Energy Propagation for Bot Detection on Graphs. Database Systems for Advanced Applications (68-83). DOI: 10.1007/978-981-97-5572-1_5. Online publication date: 31-Aug-2024
  • (2024) H²GNN: Graph Neural Networks with Homophilic and Heterophilic Feature Aggregations. Database Systems for Advanced Applications (342-352). DOI: 10.1007/978-981-97-5572-1_23. Online publication date: 31-Aug-2024
  • (2023) Adversary for Social Good: Leveraging Adversarial Attacks to Protect Personal Attribute Privacy. ACM Transactions on Knowledge Discovery from Data 18:2 (1-24). DOI: 10.1145/3614098. Online publication date: 13-Nov-2023
  • (2023) HOVER: Homophilic Oversampling via Edge Removal for Class-Imbalanced Bot Detection on Graphs. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (3728-3732). DOI: 10.1145/3583780.3615264. Online publication date: 21-Oct-2023
  • (2023) Pseudo-Labeling with Graph Active Learning for Few-shot Node Classification. 2023 IEEE International Conference on Data Mining (ICDM) (1115-1120). DOI: 10.1109/ICDM58522.2023.00133. Online publication date: 1-Dec-2023
  • (2023) SEML: Self-Supervised Information-Enhanced Meta-learning for Few-Shot Text Classification. International Journal of Computational Intelligence Systems 16:1. DOI: 10.1007/s44196-023-00287-6. Online publication date: 1-Jul-2023
