GeekMAN: Geek-oriented username Matching Across online Networks
Pages 305 - 309
Abstract
How can we identify malicious hackers who participate in different online platforms using only their usernames? Disambiguating users across online platforms (e.g., security forums, GitHub, YouTube) is an essential capability for tracking malicious hackers. Although a hacker could pick arbitrary names on different platforms, hackers often reuse the same or similar usernames because this helps them establish an online "brand". We propose GeekMAN, a systematic, human-inspired approach to identifying similar usernames across online platforms, with a focus on technogeek platforms. The key novelty is the development and integration of three capabilities: (a) decomposing usernames into meaningful chunks, (b) de-obfuscating technical and slang conventions, and (c) exhaustively considering all the different outcomes of the two previous functions when calculating the similarity. We conduct a study using 1.2M usernames from five security forums. Our method outperforms previous methods, with a precision of 81-86%. We see our approach as a fundamental research capability, which we have made publicly available on GitHub.
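The three capabilities can be illustrated with a minimal Python sketch. This is not the GeekMAN implementation: the `LEET` substitution table, the separator and camelCase chunking rule, the rule that purely numeric chunks are left untouched, the function names, and the use of `difflib.SequenceMatcher` as the string similarity are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher
from itertools import product

# Hypothetical substitution table; not taken from the paper.
LEET = {"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}


def split_chunks(username):
    """(a) Break a username into lowercase chunks at separators and camelCase boundaries."""
    marked = re.sub(r"(?<=[a-z])(?=[A-Z])", "_", username)
    return [c.lower() for c in re.split(r"[_\-.\s]+", marked) if c]


def chunk_variants(chunk):
    """(b) Return every reading of a chunk, treating each leet symbol as itself or a letter."""
    if chunk.isdigit():
        return {chunk}  # assumption: a purely numeric chunk is a number, not leet
    options = [(ch, LEET[ch]) if ch in LEET else (ch,) for ch in chunk]
    return {"".join(combo) for combo in product(*options)}


def username_variants(username):
    """Combine the readings of all chunks into full candidate strings."""
    per_chunk = [chunk_variants(c) for c in split_chunks(username)]
    return {"".join(combo) for combo in product(*per_chunk)}


def similarity(a, b):
    """(c) Score two usernames by their best-matching pair of candidate readings."""
    return max(
        SequenceMatcher(None, x, y).ratio()
        for x in username_variants(a)
        for y in username_variants(b)
    )
```

Under these assumptions, `similarity("Dark_H4cker", "darkhacker")` evaluates to 1.0, because one reading of the first username is exactly `darkhacker`. The number of readings grows exponentially with the number of ambiguous symbols, so a practical system would need to bound or prune the candidate set.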
Copyright © 2023 Owner/Author(s).
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
Published: 15 March 2024
Qualifiers
- Short-paper
Funding Sources
- NSF
Conference
ASONAM '23
ASONAM '23: International Conference on Advances in Social Networks Analysis and Mining
November 6 - 9, 2023
Kusadasi, Turkiye
Acceptance Rates
ASONAM '23 Paper Acceptance Rate 53 of 145 submissions, 37%;
Overall Acceptance Rate 116 of 549 submissions, 21%