DOI: 10.1145/3465481.3470026
research-article

Discovery of Single-Vendor Marketplace Operators in the Tor-Network

Published: 17 August 2021

Abstract

The Tor network hosts many single-vendor marketplace websites with a wide range of offers. Some of these vendor websites may be run by the same operators. This paper presents a method that measures similarities between such vendor websites in order to uncover possible operational structures behind them. To this end, similarity values between darknet websites are computed by combining features from three categories: structure, content, and metadata. A dataset is built from a first execution of the method followed by manual validation. Based on this dataset, the most important features are identified using decision trees. The structure features HTML-Tag, HTML-Class, and HTML-DOM-Tree, together with the metadata features File Content and Links-To, proved particularly important and highlight similarities between darknet websites very effectively. With the help of the similarity detection method, it was found that only 49% of 258 single-vendor marketplaces were unique, i.e. no similar site existed. In addition, several duplicates of vendor websites were found, which made up 20%.
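
To make the idea of a combined similarity value concrete, the following is a minimal sketch in Python, not the authors' implementation. It covers only two of the structure features named in the abstract (HTML-Tag and HTML-Class). The use of the Jaccard index as the per-feature measure, the equal default weights, and all function and class names are assumptions made for illustration; the abstract does not specify them.

```python
from html.parser import HTMLParser


class FeatureCollector(HTMLParser):
    """Collects the sets of tag names and CSS class names used in a page."""

    def __init__(self):
        super().__init__()
        self.tags = set()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def extract_features(html):
    """Return one set per feature; the feature names are hypothetical labels."""
    collector = FeatureCollector()
    collector.feed(html)
    collector.close()
    return {"html_tag": collector.tags, "html_class": collector.classes}


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two sets."""
    if not a and not b:
        # Assumption: two empty feature sets count as identical.
        return 1.0
    return len(a & b) / len(a | b)


def site_similarity(html_a, html_b, weights=None):
    """Weighted mean of per-feature Jaccard values (equal weights by default)."""
    feats_a = extract_features(html_a)
    feats_b = extract_features(html_b)
    weights = weights or {name: 1.0 for name in feats_a}
    total = sum(weights.values())
    score = sum(weights[n] * jaccard(feats_a[n], feats_b[n]) for n in feats_a)
    return score / total


if __name__ == "__main__":
    page_a = '<div class="shop item"><p>x</p></div>'
    page_b = '<div class="shop"><span>y</span></div>'
    print(site_similarity(page_a, page_b))
```

In this sketch a pair of pages that share all tags and classes scores 1.0, and the score drops as the sets diverge. The paper's approach of ranking features with decision trees could in principle inform the `weights` argument, but how the authors actually combine features is not stated in the abstract.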

Cited By

  • (2024) Navigating the Shadows: Manual and Semi-Automated Evaluation of the Dark Web for Cyber Threat Intelligence. IEEE Access 12, 118903-118922. DOI: 10.1109/ACCESS.2024.3448247. Online publication date: 2024
  • (2023) Trust Assessment of a Darknet Marketplace. 2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), 1806-1813. DOI: 10.1109/TrustCom60117.2023.00245. Online publication date: 1-Nov-2023
  • (2022) SoK: An Evaluation of the Secure End User Experience on the Dark Net through Systematic Literature Review. Journal of Cybersecurity and Privacy 2:2, 329-357. DOI: 10.3390/jcp2020018. Online publication date: 27-May-2022

    Information & Contributors

    Information

    Published In

    ARES '21: Proceedings of the 16th International Conference on Availability, Reliability and Security
    August 2021
    1447 pages
    ISBN:9781450390514
    DOI:10.1145/3465481
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Tor
    2. darknet offer
    3. feature importance
    4. similarity detection
    5. vendor sites

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • Federal Ministry of Education and Research

    Conference

    ARES 2021

    Acceptance Rates

    Overall Acceptance Rate 228 of 451 submissions, 51%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 25
    • Downloads (Last 6 weeks): 6
    Reflects downloads up to 23 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Navigating the Shadows: Manual and Semi-Automated Evaluation of the Dark Web for Cyber Threat IntelligenceIEEE Access10.1109/ACCESS.2024.344824712(118903-118922)Online publication date: 2024
    • (2023)Trust Assessment of a Darknet Marketplace2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)10.1109/TrustCom60117.2023.00245(1806-1813)Online publication date: 1-Nov-2023
    • (2022)SoK: An Evaluation of the Secure End User Experience on the Dark Net through Systematic Literature ReviewJournal of Cybersecurity and Privacy10.3390/jcp20200182:2(329-357)Online publication date: 27-May-2022

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    HTML Format

    View this article in HTML Format.

    HTML Format

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media