TG-OUT: temporal outlier patterns detection in Twitter attribute induced graphs

World Wide Web

Abstract

Given a network of Twitter users, can we capture their posting behavior over time and identify patterns that describe, model, or predict their activity? Can we identify temporal connectivity patterns that emerge from the use of specific attributes? More challengingly, are there particular attribute usage patterns that indicate an inherent anomaly? This work addresses all of these questions, extending previous work on other social networks and attribute types. We propose TG-OUT, a pipeline of methods that (a) model the temporal evolution of attribute induced graphs to detect peculiar attributes, (b) identify temporal patterns in attribute distributions, (c) investigate differences in patterns emerging from bot and non-bot accounts, and (d) extract tailored sets of exploitable features. Experimental results show that (i) most individual attribute distributions remain stable over time and mostly follow power laws; (ii) the temporal evolution of attribute induced graphs obeys certain laws, and deviations from those laws are outliers; (iii) the patterns deviate in ways that depend on the type of accounts using each attribute; and (iv) training a simple machine learning algorithm on only two carefully selected features produces a model that efficiently identifies attributes mainly used by bots.
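
The idea that a graph's temporal evolution "obeys certain laws and deviations are outliers" can be illustrated with a minimal sketch. This is not the TG-OUT pipeline itself: it assumes, purely for illustration, that the law is a power law relating node and edge counts across snapshots, fits it by least squares in log-log space, and scores each snapshot by its residual. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_power_law(x, y):
    """Least-squares fit of y ~ c * x**alpha in log-log space (illustrative only)."""
    alpha, log_c = np.polyfit(np.log(x), np.log(y), 1)
    return alpha, np.exp(log_c)

def deviation_scores(x, y, alpha, c):
    """Absolute log-space residual of each point from the fitted law."""
    return np.abs(np.log(y) - (np.log(c) + alpha * np.log(x)))

# Hypothetical data: node and edge counts of one attribute induced graph,
# one value per time snapshot, generated from edges = 2 * nodes**1.5.
nodes = np.array([10, 20, 40, 80, 160, 320], dtype=float)
edges = 2.0 * nodes ** 1.5
edges[3] *= 10  # inject one anomalous snapshot

alpha, c = fit_power_law(nodes, edges)
scores = deviation_scores(nodes, edges, alpha, c)
suspect = int(np.argmax(scores))  # index of the snapshot that deviates most
```

A plain least-squares fit is itself pulled toward the outlier, so a robust fit or a threshold on the score distribution would likely be preferable on real data.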




Acknowledgements

This research has been co-financed by the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH – CREATE – INNOVATE (Project Code: T1EDK-03052), as well as by the H2020 Research and Innovation Programme under Grant Agreement No. 875329.

Author information

Corresponding author

Correspondence to Ilias Dimitriadis.

Additional information


This article belongs to the Topical Collection: Special Issue on Computational Aspects of Network Science. Guest Editors: Apostolos N. Papadopoulos and Richard Chbeir.

About this article

Cite this article

Dimitriadis, I., Poiitis, M., Faloutsos, C. et al. TG-OUT: temporal outlier patterns detection in Twitter attribute induced graphs. World Wide Web 25, 2429–2453 (2022). https://doi.org/10.1007/s11280-021-00986-0

  • DOI: https://doi.org/10.1007/s11280-021-00986-0
