DOI:10.1145/3339252.3340517
research-article

Methodology for the Automated Metadata-Based Classification of Incriminating Digital Forensic Artefacts

Published: 26 August 2019

Abstract

The ever-increasing volume of data in digital forensic investigations is one of the most discussed challenges in the field. Usually, most of the file artefacts on seized devices are not pertinent to the investigation. Manually retrieving the suspicious files that are relevant is akin to finding a needle in a haystack. In this paper, a methodology for the automatic prioritisation of suspicious file artefacts (i.e., file artefacts that are pertinent to the investigation) is proposed to reduce the manual analysis effort required. This methodology is designed to work in a human-in-the-loop fashion: it predicts, or recommends, that an artefact is likely to be suspicious rather than giving the final analysis result. A supervised machine learning approach is employed, which leverages the recorded results of previously processed cases. The processes of feature extraction, dataset generation, training, and evaluation are presented in this paper. In addition, a toolkit for data extraction from disk images is outlined, which enables this method to be integrated with the conventional investigation process and to work in an automated fashion.
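
As a rough illustration of the workflow the abstract describes (feature extraction from file metadata, training on labelled artefacts from earlier cases, then ranking new artefacts for a human reviewer), here is a minimal Python sketch. Everything specific in it is an assumption made for illustration: the metadata fields (`path`, `size`), the derived features, the categorical naive Bayes classifier, and the toy records. The abstract does not state which features or which model the paper uses.

```python
import math
from collections import Counter, defaultdict


def extract_features(artefact):
    """Map one file-metadata record to categorical features.

    The field names and buckets are hypothetical, not the paper's feature set.
    """
    path = artefact["path"].lower()
    name = path.rsplit("/", 1)[-1]
    ext = name.rsplit(".", 1)[-1] if "." in name else ""
    size = artefact["size"]
    bucket = "small" if size < 10_000 else "medium" if size < 1_000_000 else "large"
    return {
        "ext": ext,
        "size": bucket,
        "depth": str(min(path.count("/"), 6)),
        "in_user_dir": str("/users/" in path),
    }


class NaiveBayes:
    """Categorical naive Bayes with Laplace smoothing (labels are True/False)."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (label, feature) -> value counts
        self.vocab = defaultdict(set)             # feature -> values seen in training
        for row, label in zip(rows, labels):
            for feat, val in row.items():
                self.value_counts[(label, feat)][val] += 1
                self.vocab[feat].add(val)
        return self

    def predict_proba(self, row):
        """Return the estimated probability that the artefact is suspicious."""
        total = sum(self.class_counts.values())
        log_post = {}
        for label, n in self.class_counts.items():
            lp = math.log(n / total)
            for feat, val in row.items():
                count = self.value_counts[(label, feat)][val]
                # +1 in the denominator reserves mass for values never seen.
                lp += math.log((count + 1) / (n + len(self.vocab[feat]) + 1))
            log_post[label] = lp
        peak = max(log_post.values())
        weights = {k: math.exp(v - peak) for k, v in log_post.items()}
        return weights.get(True, 0.0) / sum(weights.values())


def rank_artefacts(model, artefacts):
    """Sort artefacts by predicted suspiciousness, highest first."""
    scored = [(model.predict_proba(extract_features(a)), a["path"]) for a in artefacts]
    return sorted(scored, reverse=True)


# Toy stand-in for the recorded results of previously processed cases.
previous_cases = [
    ({"path": "/Users/a/Pictures/img001.jpg", "size": 250_000}, True),
    ({"path": "/Users/a/Pictures/img002.jpg", "size": 310_000}, True),
    ({"path": "/Users/a/Downloads/clip.mp4", "size": 5_000_000}, True),
    ({"path": "/Windows/System32/kernel32.dll", "size": 700_000}, False),
    ({"path": "/Windows/System32/drivers/etc/hosts", "size": 800}, False),
    ({"path": "/Program Files/App/app.exe", "size": 2_000_000}, False),
]
model = NaiveBayes().fit(
    [extract_features(a) for a, _ in previous_cases],
    [label for _, label in previous_cases],
)

new_image = [
    {"path": "/Windows/System32/ntdll.dll", "size": 650_000},
    {"path": "/Users/b/Pictures/holiday.jpg", "size": 280_000},
]
ranked = rank_artefacts(model, new_image)
for score, path in ranked:
    print(f"{score:.2f}  {path}")
```

The output is a prioritised list, not a verdict: the scores only decide which artefacts an investigator looks at first, in keeping with the human-in-the-loop design.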

Published In

ACM Other conferences
ARES '19: Proceedings of the 14th International Conference on Availability, Reliability and Security
August 2019
979 pages
ISBN:9781450371643
DOI:10.1145/3339252

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Artefact Relevancy
  2. Automatic Forensic Investigation
  3. Digital Forensics
  4. Machine Learning

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ARES '19

Acceptance Rates

Overall Acceptance Rate 228 of 451 submissions, 51%

Article Metrics

  • Downloads (Last 12 months): 78
  • Downloads (Last 6 weeks): 6
Reflects downloads up to 18 Nov 2024

Cited By

  • (2024) Metadata-Based Detection of Child Sexual Abuse Material. IEEE Transactions on Dependable and Secure Computing 21:4 (3153-3164). DOI: 10.1109/TDSC.2023.3324275. Online publication date: Jul-2024
  • (2024) Enhancing Image Forensics with Transformer: A Multi-head Attention Approach for Robust Metadata Analysis. Proceedings of Trends in Electronics and Health Informatics (655-669). DOI: 10.1007/978-981-97-3937-0_45. Online publication date: 17-Oct-2024
  • (2023) Machine-Learning Forensics: State of the Art in the Use of Machine-Learning Techniques for Digital Forensic Investigations within Smart Environments. Applied Sciences 13:18 (10169). DOI: 10.3390/app131810169. Online publication date: 10-Sep-2023
  • (2023) Enhancing Digital Investigation: Leveraging ChatGPT for Evidence Identification and Analysis in Digital Forensics. 2023 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS) (733-738). DOI: 10.1109/ICCCIS60361.2023.10425000. Online publication date: 3-Nov-2023
  • (2023) Enhancing Forensic Analysis Using a Machine Learning-based Approach. 2023 6th International Conference on Advanced Communication Technologies and Networking (CommNet) (1-6). DOI: 10.1109/CommNet60167.2023.10365260. Online publication date: 11-Dec-2023
  • (2023) SDOT: Secure Hash, Semantic Keyword Extraction, and Dynamic Operator Pattern-Based Three-Tier Forensic Classification Framework. IEEE Access 11 (3291-3306). DOI: 10.1109/ACCESS.2023.3234434. Online publication date: 2023
  • (2023) Artificial Intelligence in Digital Forensics. Encyclopedia of Forensic Sciences, Third Edition (170-192). DOI: 10.1016/B978-0-12-823677-2.00236-1. Online publication date: 2023
  • (2021) A survey of machine learning applications in digital forensics. Trends in Computer Science and Information Technology (020-024). DOI: 10.17352/tcsit.000034. Online publication date: 8-Apr-2021
  • (2020) Automated Artefact Relevancy Determination from Artefact Metadata and Associated Timeline Events. 2020 International Conference on Cyber Security and Protection of Digital Services (Cyber Security) (1-8). DOI: 10.1109/CyberSecurity49315.2020.9138874. Online publication date: Jun-2020
  • (2020) DeepUAge: Improving Underage Age Estimation Accuracy to Aid CSEM Investigation. Forensic Science International: Digital Investigation 32 (300921). DOI: 10.1016/j.fsidi.2020.300921. Online publication date: Apr-2020
