DOI: 10.1145/3625007.3627498
short-paper
Open access

GeekMAN: Geek-oriented username Matching Across online Networks

Published: 15 March 2024

Abstract

How can we identify malicious hackers participating in different online platforms using only their usernames? Disambiguating users across online platforms (e.g., security forums, GitHub, YouTube) is an essential capability for tracking malicious hackers. Although a hacker could pick arbitrary names on different platforms, they often use the same or similar usernames, as this helps them establish an online "brand". We propose GeekMAN, a systematic human-inspired approach to identify similar usernames across online platforms, focusing on technogeek platforms. The key novelty consists of the development and integration of three capabilities: (a) decomposing usernames into meaningful chunks, (b) de-obfuscating technical and slang conventions, and (c) considering all the different outcomes of the two previous functions exhaustively when calculating the similarity. We conduct a study using 1.2M usernames from five security forums. Our method outperforms previous methods with a Precision of 81-86%. We see our approach as a fundamental research capability, which we have made publicly available on GitHub.
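To make the three capabilities concrete, the following is a minimal Python sketch of the general idea, not the GeekMAN implementation. The substitution table `LEET`, the helper names (`split_chunks`, `readings`, `variants`, `similarity`), the splitting rules, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration; the abstract does not specify any of them.

```python
import re
from difflib import SequenceMatcher
from itertools import product

# Hypothetical substitution table (an assumption, not taken from the paper):
# each obfuscated symbol maps to the plain letters it may stand for.
LEET = {"0": "o", "1": "il", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}


def split_chunks(username):
    """(a) Decompose a username on separators and lower-to-upper case changes."""
    spaced = re.sub(r"[._\-]+", " ", username)
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", spaced)
    return [chunk.lower() for chunk in spaced.split()]


def readings(chunk):
    """(b) Return every reading of a chunk: each symbol is kept or de-obfuscated."""
    options = [ch + LEET.get(ch, "") for ch in chunk]
    return {"".join(choice) for choice in product(*options)}


def variants(username):
    """Combine the readings of all chunks into full candidate strings."""
    per_chunk = [sorted(readings(chunk)) for chunk in split_chunks(username)]
    return {"".join(parts) for parts in product(*per_chunk)}


def similarity(name_a, name_b):
    """(c) Compare all variant pairs exhaustively and keep the best score."""
    return max(
        SequenceMatcher(None, a, b).ratio()
        for a in variants(name_a)
        for b in variants(name_b)
    )


if __name__ == "__main__":
    print(split_chunks("darkCoder_99"))
    print(sorted(readings("h4ck3r")))
    print(similarity("H4ck3r_Joe", "hackerjoe"))
```

The number of variants grows exponentially with the number of obfuscated symbols, so a sketch like this is only practical for short strings such as usernames; a real system would likely need pruning or a dictionary to keep only meaningful readings.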

Information & Contributors

Information

Published In

ASONAM '23: Proceedings of the 2023 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
November 2023
835 pages
ISBN: 9798400704093
DOI: 10.1145/3625007
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Short-paper

Funding Sources

  • NSF

Conference

ASONAM '23
Acceptance Rates

ASONAM '23 Paper Acceptance Rate 53 of 145 submissions, 37%;
Overall Acceptance Rate 116 of 549 submissions, 21%

Bibliometrics

  • Total Citations: 0
  • Total Downloads: 136
  • Downloads (Last 12 months): 136
  • Downloads (Last 6 weeks): 16

Reflects downloads up to 18 Feb 2025