research-article

Automated Detection of Doxing on Twitter

Published: 11 November 2022

Abstract

Doxing refers to the practice of disclosing sensitive personal information about a person without their consent. This form of cyberbullying is an unpleasant and sometimes dangerous phenomenon for online social networks. Although prior work exists on automated identification of other types of cyberbullying, a need exists for methods capable of detecting doxing on Twitter specifically. We propose and evaluate a set of approaches for automatically detecting second- and third-party disclosures on Twitter of sensitive private information, a subset of which constitutes doxing. We summarize our findings of common intentions behind doxing episodes and compare nine different approaches for automated detection based on string-matching and one-hot encoded heuristics, as well as word and contextualized string embedding representations of tweets. We identify an approach providing 96.86% accuracy and 97.37% recall using contextualized string embeddings and conclude by discussing the practicality of our proposed methods.
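To make the heuristic family of approaches and the reported metrics more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the regular expressions, the feature names, and the helper functions `heuristic_vector` and `accuracy_and_recall` are hypothetical, and treating "one-hot encoded heuristics" as one binary indicator per matched pattern is an assumption about what the abstract means.

```python
import re
from typing import List, Sequence, Tuple

# Hypothetical string-matching heuristics (assumptions, not from the paper):
# each pattern flags one kind of potentially sensitive string in a tweet.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}
FEATURES = list(PATTERNS)  # fixed feature order: email, phone, ssn_like


def heuristic_vector(tweet: str) -> List[int]:
    """Return one binary indicator per heuristic (1 = pattern found)."""
    return [1 if PATTERNS[name].search(tweet) else 0 for name in FEATURES]


def accuracy_and_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Compute accuracy and recall for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall
```

A vector like this could feed any standard classifier. Recall matters alongside accuracy here because a missed disclosure (a false negative) is the costly error for a detector of this kind.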

Information & Contributors

Information

Published In

Proceedings of the ACM on Human-Computer Interaction  Volume 6, Issue CSCW2
CSCW
November 2022
8205 pages
EISSN:2573-0142
DOI:10.1145/3571154
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 November 2022
Published in PACMHCI Volume 6, Issue CSCW2

Author Tags

  1. cyberbullying
  2. doxing
  3. hate speech
  4. online harassment
  5. privacy
  6. private information
  7. social network
  8. twitter

Qualifiers

  • Research-article
