research-article

Detecting spammers on social networks

Authors:

Xianghan Zheng,

Chunming Rong

Neurocomputing, Volume 159, Issue C

Pages 27 - 34

https://doi.org/10.1016/j.neucom.2015.02.047

Published: 02 July 2015

Abstract

Social networks have become a very popular way for internet users to communicate and interact online. Users spend a great deal of time on well-known social networks (e.g., Facebook, Twitter, Sina Weibo), reading news, discussing events and posting messages. Unfortunately, this popularity also attracts a significant number of spammers who continuously exhibit malicious behavior (e.g., posting messages containing commercial URLs, following a large number of users), which causes considerable confusion and inconvenience in users' social activities. In this paper, a supervised machine learning based solution is proposed for effective spammer detection. The main procedure of the work is as follows. First, a dataset is collected from Sina Weibo, comprising 30,116 users and more than 16 million messages. Then, a labeled dataset is constructed by manually classifying users into spammers and non-spammers. Afterwards, a set of features is extracted from message content and users' social behavior and applied to an SVM (Support Vector Machine) based spammer detection algorithm. The experiment shows that the proposed solution provides excellent performance, with the true positive rates for spammers and non-spammers reaching 99.1% and 99.9% respectively.
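The last two steps of the procedure above (training an SVM on per-user features and reporting a true positive rate per class) can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain-Python linear SVM trained by subgradient descent on the regularised hinge loss. The two features (fraction of a user's messages containing a URL, and a normalised followee count), the toy data, and all function names and hyperparameters are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence, Tuple

SPAMMER, NON_SPAMMER = 1, -1  # label encoding assumed for this sketch


def train_linear_svm(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    lam: float = 0.01,   # regularisation strength (assumed value)
    lr: float = 0.05,    # learning rate (assumed value)
    epochs: int = 500,
) -> Tuple[List[float], float]:
    """Minimise the L2-regularised hinge loss by per-sample subgradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Regularisation step: shrink the weights slightly.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge-loss step: only samples inside the margin move the boundary.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Classify one user from the sign of the decision function."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return SPAMMER if score >= 0 else NON_SPAMMER


def per_class_tpr(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[int, float]:
    """Fraction of each class that is labelled correctly (assumes both classes occur)."""
    rates = {}
    for label in (SPAMMER, NON_SPAMMER):
        preds = [p for t, p in zip(y_true, y_pred) if t == label]
        rates[label] = sum(p == label for p in preds) / len(preds)
    return rates


# Hypothetical toy data: [fraction of messages with a URL, normalised followee count].
X = [
    [0.90, 0.80], [0.80, 0.90], [0.95, 0.70],   # spammer-like users
    [0.10, 0.20], [0.20, 0.10], [0.05, 0.30],   # ordinary users
]
y = [SPAMMER, SPAMMER, SPAMMER, NON_SPAMMER, NON_SPAMMER, NON_SPAMMER]

w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
print(per_class_tpr(y, predictions))
```

In this sketch the rate for each class is computed separately, which is how a pair of figures such as 99.1% for spammers and 99.9% for non-spammers would be obtained. A realistic evaluation would measure these rates on held-out users rather than on the training set used here.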


Published In

Neurocomputing, Volume 159, Issue C, July 2015, 306 pages

ISSN: 0925-2312

Copyright © The Authors.

Publisher

Elsevier Science Publishers B. V., Netherlands
