Detecting Social Bot on the Fly using Contrastive Learning

Authors:

Yangli-Ao Geng,

Jie Tang

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

Pages 4995 - 5001

https://doi.org/10.1145/3583780.3615468

Published: 21 October 2023

Abstract

Social bot detection has become a task of wide concern in social network security. Its development has long been hindered by the lack of high-quality annotated data. In addition, rapid progress in AI-Generated Content (AIGC) technology is dramatically improving the creative ability of social bots. For example, the recently released ChatGPT [2] can fool a state-of-the-art AI-text detection method with a probability of 74%, which poses a major challenge to content-based bot detection methods. To address these challenges, we propose a Contrastive Learning-driven Social Bot Detection framework (CBD). The core of CBD is a two-stage learning strategy: a contrastive pre-training stage that mines generalizable patterns from massive unlabeled social graphs, followed by a semi-supervised fine-tuning stage that models the task-specific knowledge latent in social graphs with only a few annotations. This strategy gives our model promising detection performance under extreme scarcity of labeled data. In terms of system architecture, we propose a smart feedback mechanism to further improve detection performance. Comprehensive experiments on a real-world bot detection dataset show that CBD consistently outperforms 10 state-of-the-art baselines by a large margin in few-shot bot detection with very little labeled data (5-shot). CBD has been deployed online.
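The abstract does not state which contrastive objective CBD uses in its pre-training stage. As a generic illustration of what a contrastive objective computes, the sketch below implements an InfoNCE-style loss (the form introduced with contrastive predictive coding [29]) in plain Python. It assumes two lists of non-zero node embeddings in which row i of each list comes from a different view of the same node; the function names, the temperature value, and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss over all rows.

    Row i of view_a and row i of view_b form the positive pair;
    every other row of view_b acts as a negative for row i.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the positive pair.
        total += log_denom - logits[i]
    return total / n


if __name__ == "__main__":
    # Toy embeddings (assumed, for illustration only).
    nodes = [[1.0, 0.0], [0.0, 1.0]]
    print("aligned views :", info_nce(nodes, nodes))
    print("shuffled views:", info_nce(nodes, nodes[::-1]))
```

The loss is small when each embedding is closest to its own counterpart in the other view and large when the pairing is scrambled, which is the signal a contrastive pre-training stage can exploit without labels. A fine-tuning stage would then train a classifier on top of the pre-trained encoder using the few available annotations.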

References

[1]

Sami Abu-El-Haija, Bryan Perozzi, Amol Kapoor, Nazanin Alipourfard, Kristina Lerman, Hrayr Harutyunyan, Greg Ver Steeg, and Aram Galstyan. 2019. MixHop: Higher-order graph convolutional architectures via sparsified neighborhood mixing. In International Conference on Machine Learning. PMLR, 21--29.

[2]

OpenAI. 2022. ChatGPT: Optimizing Language Models for Dialogue. https://openai.com/blog/chatgpt/, Accessed: 2023-02-25.

[3]

OpenAI. 2023. New AI classifier for indicating AI-written text. https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text/, Accessed: 2023-02-25.

[4]

Seyed Ali Alhosseini, Raad Bin Tareaf, Pejman Najafi, and Christoph Meinel. 2019. Detect me if you can: Spam bot detection using inductive representation learning. In Companion Proceedings of The 2019 World Wide Web Conference. 148--153.

[5]

Bo Chen, Jing Zhang, Xiaokang Zhang, Yuxiao Dong, Jian Song, Peng Zhang, Kaibo Xu, Evgeny Kharlamov, and Jie Tang. 2022. GCCAD: Graph Contrastive Learning for Anomaly Detection. IEEE Transactions on Knowledge and Data Engineering (2022).

[6]

Eli Chien, Jianhao Peng, Pan Li, and Olgica Milenkovic. 2020. Adaptive universal generalized pagerank graph neural network. arXiv preprint arXiv:2006.07988 (2020).

[7]

Stefano Cresci. 2020. A decade of social bot detection. Commun. ACM, Vol. 63, 10 (2020), 72--83.

[8]

Mohd Fazil, Amit Kumar Sah, and Muhammad Abulaish. 2021. DeepSBD: A deep neural network model with attention mechanism for socialbot detection. IEEE Transactions on Information Forensics and Security, Vol. 16 (2021), 4211--4223.

[9]

Shangbin Feng, Zhaoxuan Tan, Rui Li, and Minnan Luo. 2021. Heterogeneity-aware Twitter Bot Detection with Relational Graph Transformers. arXiv preprint arXiv:2109.02927 (2021).

[10]

Shangbin Feng, Zhaoxuan Tan, Herun Wan, Ningnan Wang, Zilong Chen, Binchi Zhang, Qinghua Zheng, Wenqian Zhang, Zhenyu Lei, Shujie Yang, et al. 2022. TwiBot-22: Towards Graph-Based Twitter Bot Detection. arXiv preprint arXiv:2206.04564 (2022).

[11]

Emilio Ferrara. 2017. Disinformation and social bot operations in the run up to the 2017 French presidential election. arXiv preprint arXiv:1707.00086 (2017).

[12]

Matthias Fey and Jan E. Lenssen. 2019. Fast Graph Representation Learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds.

[13]

Qinglang Guo, Haiyong Xie, Yangyang Li, Wen Ma, and Chao Zhang. 2021. Social Bots Detection via Fusing BERT and Graph Convolutional Networks. Symmetry, Vol. 14, 1 (2021), 30.

[14]

Kaveh Hassani and Amir Hosein Khasahmadi. 2020. Contrastive multi-view representation learning on graphs. In International Conference on Machine Learning. PMLR, 4116--4126.

[15]

Maryam Heidari, James H Jones Jr, and Ozlem Uzuner. 2021. An empirical study of machine learning algorithms for social media bot detection. In 2021 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS). IEEE, 1--5.

[16]

Maryam Heidari, James H Jones, and Ozlem Uzuner. 2020. Deep contextualized word embedding for text-based online user profiling to detect social bots on twitter. In 2020 International Conference on Data Mining Workshops (ICDMW). IEEE, 480--487.

[17]

Maryam Heidari and James H Jones Jr. 2022. Bert Model for Social Media Bot Detection. (2022).

[18]

R Devon Hjelm, Alex Fedorov, Samuel Lavoie-Marchildon, Karan Grewal, Phil Bachman, Adam Trischler, and Yoshua Bengio. 2018. Learning deep representations by mutual information estimation and maximization. arXiv preprint arXiv:1808.06670 (2018).

[19]

Andrzej Jarynowski. 2022. Conflicts driven pandemic and war issues in Social Media via multi-layer approach of German Twitter. (2022).

[20]

Yizhu Jiao, Yun Xiong, Jiawei Zhang, Yao Zhang, Tianqi Zhang, and Yangyong Zhu. 2020. Sub-graph contrast for scalable self-supervised graph representation learning. In 2020 IEEE International Conference on Data Mining (ICDM). IEEE, 222--231.

[21]

Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).

[22]

Thomas N Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016).

[23]

Johannes Klicpera, Aleksandar Bojchevski, and Stephan Günnemann. 2018. Predict then propagate: Graph neural networks meet personalized pagerank. arXiv preprint arXiv:1810.05997 (2018).

[24]

Sneha Kudugunta and Emilio Ferrara. 2018. Deep neural networks for bot detection. Information Sciences, Vol. 467 (2018), 312--322.

[25]

Tingting Li, Ziming Zeng, Shouqiang Sun, and Jingjing Sun. 2022b. A novel integrated framework based on multi-view features for multidimensional social bot detection. Journal of Information Science (2022), 01655515221116517.

[26]

Wenbin Li, Xiaokai Chu, Yueyang Su, Di Yao, Shiwei Zhao, Runze Wu, Shize Zhang, Jianrong Tao, Hao Deng, and Jingping Bi. 2022a. FingFormer: Contrastive Graph-based Finger Operation Transformer for Unsupervised Mobile Game Bot Detection. In Proceedings of the ACM Web Conference 2022. 3367--3375.

[27]

Yangyang Li, Yipeng Ji, Shaoning Li, Shulong He, Yinhao Cao, Yifeng Liu, Hong Liu, Xiong Li, Jun Shi, and Yangchao Yang. 2021. Relevance-aware anomalous users detection in social network via graph neural network. In 2021 International Joint Conference on Neural Networks (IJCNN). IEEE, 1--8.

[28]

Derek Lim, Felix Hohne, Xiuyu Li, Sijia Linda Huang, Vaishnavi Gupta, Omkar Bhalerao, and Ser Nam Lim. 2021. Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods. Advances in Neural Information Processing Systems, Vol. 34 (2021).

[29]

Aaron van den Oord, Yazhe Li, and Oriol Vinyals. 2018. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748 (2018).

[30]

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, Vol. 32 (2019).

[31]

Pandu Gumelar Pratama and Nur Aini Rakhmawati. 2019. Social bot detection on 2019 Indonesia president candidate's supporter's tweets. Procedia Computer Science, Vol. 161 (2019), 813--820.

[32]

Jiezhong Qiu, Qibin Chen, Yuxiao Dong, Jing Zhang, Hongxia Yang, Ming Ding, Kuansan Wang, and Jie Tang. 2020. GCC: Graph contrastive coding for graph neural network pre-training. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 1150--1160.

[33]

Chiara Ravazzi, Francesco Malandrino, and Fabrizio Dabbene. 2022. Towards Proactive Moderation of Malicious Content via Bot Detection in Fringe Social Networks. IEEE Control Systems Letters (2022).

[34]

Mohsen Sayyadiharikandeh, Onur Varol, Kai-Cheng Yang, Alessandro Flammini, and Filippo Menczer. 2020. Detection of novel social bots by ensembles of specialized classifiers. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management. 2725--2732.

[35]

Michael Schlichtkrull, Thomas N Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, and Max Welling. 2018. Modeling relational data with graph convolutional networks. In European Semantic Web Conference. Springer, 593--607.

[36]

Chengcheng Shao, Giovanni Luca Ciampaglia, Onur Varol, Alessandro Flammini, and Filippo Menczer. 2017. The spread of fake news by social bots. arXiv preprint arXiv:1707.07592, Vol. 96 (2017), 104.

[37]

Peining Shi, Zhiyong Zhang, and Kim-Kwang Raymond Choo. 2019. Detecting malicious social bots based on clickstream sequences. IEEE Access, Vol. 7 (2019), 28855--28862.

[38]

Wen Shi, Diyi Liu, Jing Yang, Jing Zhang, Sanmei Wen, and Jing Su. 2020. Social bots' sentiment engagement in health emergencies: A topic-based analysis of the covid-19 pandemic discussions on twitter. International Journal of Environmental Research and Public Health, Vol. 17, 22 (2020), 8701.

[39]

Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, Vol. 15, 1 (2014), 1929--1958.

[40]

New York Times. 2022. Musk Says Twitter Committed Fraud in Dispute Over Fake Accounts. https://www.nytimes.com/2022/08/04/technology/musk-twitter-fraud.html, Accessed: 2023-02-26.

[41]

Iraklis Varlamis, Dimitrios Michail, Foteini Glykou, and Panagiotis Tsantilas. 2022. A Survey on the Use of Graph Convolutional Networks for Combating Fake News. Future Internet, Vol. 14, 3 (2022), 70.

[42]

Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903 (2017).

[43]

Wikipedia. 2023. Social bot. https://en.wikipedia.org/wiki/Social_bot, Accessed: 2023-05-30.

[44]

Samuel C Woolley. 2016. Automating power: Social bot interference in global politics. First Monday (2016).

[45]

Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2018a. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826 (2018).

[46]

Keyulu Xu, Chengtao Li, Yonglong Tian, Tomohiro Sonobe, Ken-ichi Kawarabayashi, and Stefanie Jegelka. 2018b. Representation learning on graphs with jumping knowledge networks. In International Conference on Machine Learning. PMLR, 5453--5462.

[47]

Kai-Cheng Yang, Pik-Mai Hui, and Filippo Menczer. 2019. Bot electioneering volume: Visualizing social bot activity during elections. In Companion Proceedings of The 2019 World Wide Web Conference. 214--217.

[48]

Kai-Cheng Yang, Onur Varol, Pik-Mai Hui, and Filippo Menczer. 2020. Scalable and generalizable social bot detection through data selection. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 1096--1103.

[49]

Menghan Zhang, Xue Qi, Ze Chen, and Jun Liu. 2022. Social Bots' Involvement in the COVID-19 Vaccine Discussions on Twitter. International Journal of Environmental Research and Public Health, Vol. 19, 3 (2022), 1651.

[50]

Jiong Zhu, Yujun Yan, Lingxiao Zhao, Mark Heimann, Leman Akoglu, and Danai Koutra. 2020. Beyond homophily in graph neural networks: Current limitations and effective designs. Advances in Neural Information Processing Systems, Vol. 33 (2020), 7793--7804.

Cited By

Zhang D, Zheng S, Zhu Y, Yuan H, Gong J, Tang J. (2024). MCAP: Low-Pass GNNs with Matrix Completion for Academic Recommendations. ACM Transactions on Information Systems 43:2 (1-29). DOI: 10.1145/3698193. Online publication date: 1-Oct-2024.
https://dl.acm.org/doi/10.1145/3698193
Huang H, Tian H, Zheng X, Zhang X, Zeng D, Wang F. (2024). CGNN: A Compatibility-Aware Graph Neural Network for Social Media Bot Detection. IEEE Transactions on Computational Social Systems 11:5 (6528-6543). DOI: 10.1109/TCSS.2024.3396413. Online publication date: Oct-2024.
https://doi.org/10.1109/TCSS.2024.3396413
Huang H, Zhao M. (2024). GMAE2: Stacking Graph Masked Autoencoder on Feature Autoencoder for Social Bot Detection. Proceedings of 2024 12th China Conference on Command and Control (285-297). DOI: 10.1007/978-981-97-7774-7_26. Online publication date: 27-Dec-2024.
https://doi.org/10.1007/978-981-97-7774-7_26

Published In

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

October 2023

5508 pages

ISBN:9798400701245

DOI:10.1145/3583780

General Chairs: Ingo Frommholz (University of Wolverhampton, UK), Frank Hopfgartner (University of Koblenz, Germany), Mark Lee (University of Birmingham, UK), Michael Oakes (University of Birmingham, UK)
Program Chairs: Mounia Lalmas (Spotify, UK), Min Zhang (Tsinghua University, China), Rodrygo Santos (Federal University of Minas Gerais, Brazil)

Copyright © 2023 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

China Postdoctoral Science Foundation
Natural Science Foundation of China
National Key R&D Program of China

Conference

CIKM '23

CIKM '23: The 32nd ACM International Conference on Information and Knowledge Management

October 21 - 25, 2023

Birmingham, United Kingdom

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics

Total Citations: 3
Total Downloads: 1,107
Downloads (Last 12 months): 798
Downloads (Last 6 weeks): 51

Reflects downloads up to 16 Feb 2025
