Bots in Social and Interaction Networks: Detection and Impact Estimation

Published: 17 October 2020

Abstract

The rise of bots and their influence on social networks is a hot topic that has aroused the interest of many researchers. Despite the efforts to detect social bots, it is still difficult to distinguish them from legitimate users. Here, we propose a simple yet effective semi-supervised method that allows distinguishing between bots and legitimate users with high accuracy. The method learns a joint representation of social connections and interactions between users by leveraging graph-based representation learning. Then, on the proximity graph derived from user embeddings, a sample of bots is used as seeds for a label propagation algorithm. We demonstrate that when the label propagation is done according to pairwise account proximity, our method achieves F1 = 0.93, whereas other state-of-the-art techniques achieve F1 ≤ 0.87. By applying our method to a large dataset of retweets, we uncover the presence of different clusters of bots in the network of Twitter interactions. Interestingly, such clusters feature different degrees of integration with legitimate users. By analyzing the interactions produced by the different clusters of bots, our results suggest that a significant group of users was systematically exposed to content produced by bots and to interactions with bots, indicating the presence of a selective exposure phenomenon.
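The pipeline outlined in the abstract (user embeddings, a proximity graph built from them, then label propagation from seed accounts) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the synthetic Gaussian vectors standing in for learned embeddings, the cosine-similarity k-nearest-neighbour graph, the function names, and the parameter values. The abstract mentions only bot seeds; this sketch also clamps a few legitimate-user seeds so that the toy propagation has two anchors.

```python
import numpy as np


def knn_proximity_graph(embeddings, k):
    """Symmetric k-nearest-neighbour graph weighted by cosine similarity."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a node is never its own neighbour
    n = sim.shape[0]
    neighbours = np.argsort(-sim, axis=1)[:, :k]  # k most similar nodes per row
    rows = np.repeat(np.arange(n), k)
    cols = neighbours.ravel()
    weights = np.zeros((n, n))
    # Keep only non-negative similarities as edge weights.
    weights[rows, cols] = np.clip(sim[rows, cols], 0.0, None)
    return np.maximum(weights, weights.T)  # make the graph undirected


def propagate_bot_scores(weights, seed_bots, seed_humans, n_iter=50):
    """Iteratively average neighbour scores while clamping the seed labels.

    Human seeds are an assumption of this sketch, not taken from the abstract.
    """
    scores = np.full(weights.shape[0], 0.5)  # 0.5 means "unknown"
    degree = weights.sum(axis=1)
    connected = degree > 0  # isolated nodes keep their current score
    for _ in range(n_iter):
        scores[seed_bots] = 1.0
        scores[seed_humans] = 0.0
        averaged = weights @ scores
        scores[connected] = averaged[connected] / degree[connected]
    scores[seed_bots] = 1.0
    scores[seed_humans] = 0.0
    return scores


def f1_score(is_bot, predicted_bot):
    """F1 = 2TP / (2TP + FP + FN) for boolean arrays, bots as the positive class."""
    tp = np.sum(is_bot & predicted_bot)
    fp = np.sum(~is_bot & predicted_bot)
    fn = np.sum(is_bot & ~predicted_bot)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def make_toy_embeddings(n_per_class=60, dim=8, seed=0):
    """Two synthetic Gaussian clusters standing in for learned user embeddings."""
    rng = np.random.default_rng(seed)
    bots = rng.normal(loc=3.0, scale=1.0, size=(n_per_class, dim))
    humans = rng.normal(loc=-3.0, scale=1.0, size=(n_per_class, dim))
    is_bot = np.arange(2 * n_per_class) < n_per_class
    return np.vstack([bots, humans]), is_bot


embeddings, is_bot = make_toy_embeddings()
graph = knn_proximity_graph(embeddings, k=10)
scores = propagate_bot_scores(
    graph, seed_bots=np.arange(5), seed_humans=np.arange(60, 65)
)
print("F1 on toy data:", f1_score(is_bot, scores > 0.5))
```

The toy clusters are far easier to separate than real accounts, so the printed score says nothing about the F1 values reported in the abstract. The sketch only shows how seed labels spread along proximity edges and how an F1 score is computed from the resulting predictions.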

Published In

ACM Transactions on Information Systems  Volume 39, Issue 1
January 2021
329 pages
ISSN:1046-8188
EISSN:1558-2868
DOI:10.1145/3423044
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 October 2020
Accepted: 01 August 2020
Revised: 01 May 2020
Received: 01 October 2019
Published in TOIS Volume 39, Issue 1

Author Tags

  1. Social bots
  2. disinformation
  3. label propagation
  4. selective exposure
  5. semi-supervised bot detection
  6. user embeddings

Qualifiers

  • Research-article
  • Research
  • Refereed
