short-paper

Open access

GeekMAN: Geek-oriented username Matching Across online Networks

Authors:

Md Rayhanul Masud,

Michalis Faloutsos

ASONAM '23: Proceedings of the International Conference on Advances in Social Networks Analysis and Mining

Pages 305 - 309

https://doi.org/10.1145/3625007.3627498

Published: 15 March 2024

Abstract

How can we identify malicious hackers who participate in different online platforms using only their usernames? Disambiguating users across online platforms (e.g., security forums, GitHub, YouTube) is an essential capability for tracking malicious hackers. Although a hacker could pick arbitrary names on different platforms, they often use the same or similar usernames, as this helps them establish an online "brand". We propose GeekMAN, a systematic, human-inspired approach to identify similar usernames across online platforms, with a focus on technogeek platforms. The key novelty consists of the development and integration of three capabilities: (a) decomposing usernames into meaningful chunks, (b) de-obfuscating technical and slang conventions, and (c) exhaustively considering all the different outcomes of the two previous functions when calculating the similarity. We conduct a study using 1.2M usernames from five security forums. Our method outperforms previous methods with a Precision of 81-86%. We see our approach as a fundamental research capability, which we have made publicly available on GitHub.
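The three capabilities named in the abstract can be illustrated with a minimal Python sketch. This is not the GeekMAN implementation: the leet table `LEET`, the chunking rule (split on separators and case changes), the order-insensitive joining of chunks, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made here for illustration, and every function name is hypothetical.

```python
import itertools
import re
from difflib import SequenceMatcher

# Hypothetical leet table (an assumption, not the paper's mapping).
# A symbol may stand for more than one letter, e.g. "1" for "i" or "l".
LEET = {"0": "o", "1": "il", "3": "e", "4": "a",
        "5": "s", "7": "t", "@": "a", "$": "s"}


def chunks(username):
    """(a) Split on separators, then at upper-case boundaries; lower-case the result."""
    out = []
    for part in re.split(r"[._\-\s]+", username):
        out.extend(p.lower() for p in re.findall(r"[A-Z]?[^A-Z]+|[A-Z]+", part))
    return out


def deobfuscate(chunk):
    """(b) Every reading of a chunk: each leet symbol is kept or replaced by a letter."""
    options = [c + LEET.get(c, "") for c in chunk]
    return {"".join(combo) for combo in itertools.product(*options)}


def variants(username):
    """Combine the readings of all chunks; chunk order is ignored (an assumption)."""
    per_chunk = [deobfuscate(c) for c in chunks(username)]
    return {"".join(sorted(combo)) for combo in itertools.product(*per_chunk)}


def similarity(a, b):
    """(c) Best score over all pairs of variants of the two usernames."""
    return max(SequenceMatcher(None, x, y).ratio()
               for x in variants(a) for y in variants(b))


if __name__ == "__main__":
    print(chunks("D4rk_H4cker"))
    print(sorted(deobfuscate("h4ck3r")))
    print(similarity("D4rk_H4cker", "HackerDark"))
```

The number of variants grows exponentially with the number of leet symbols in a username, so a practical system would likely need to cap or prune the enumeration; the sketch makes no attempt to do so.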

Information & Contributors

Information

Published In

ASONAM '23: Proceedings of the 2023 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

November 2023

835 pages

ISBN: 9798400704093

DOI: 10.1145/3625007

Chair: Jon Rokne

Program Chair: Dong Wang

Copyright © 2023 Owner/Author(s).

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Funding Sources

NSF

Conference

ASONAM '23

Sponsor:

SIGKDD

ASONAM '23: International Conference on Advances in Social Networks Analysis and Mining

November 6 - 9, 2023

Kusadasi, Turkiye

Acceptance Rates

ASONAM '23 Paper Acceptance Rate 53 of 145 submissions, 37%;

Overall Acceptance Rate 116 of 549 submissions, 21%
