research-article

Audio Recapture Detection With Convolutional Neural Networks

Published: 01 August 2016

Abstract

In this paper, we investigate how features can be effectively learned by deep neural networks for audio forensic problems. Using a preliminary feature preprocessing step based on electric network frequency (ENF) analysis, we propose a convolutional neural network (CNN) for training and classification of genuine and recaptured audio recordings. The network learns hierarchical representations that capture the ENF components at several levels of detail, and these representations can be used for further classification. The proposed method works for short audio clips of 2-second duration, whereas the state of the art may fail on clips this short. Experimental results demonstrate that the proposed network yields high detection accuracy when each ENF harmonic component is represented as a single-channel input. Performance can be further improved by a combined input representation that incorporates both the fundamental ENF and its harmonics. The convergence property of the network and the effect of varying the analysis window size are also studied. A performance comparison against the support tensor machine demonstrates the advantage of using a CNN for audio recapture detection. Moreover, visualization of the intermediate feature maps provides some insight into what the deep neural networks actually learn and how they make decisions.
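The abstract mentions feeding each ENF harmonic to the network as a single-channel input and combining the fundamental with its harmonics. The following is a minimal sketch of how such channels might be isolated from a 2-second clip; it is not the paper's preprocessing pipeline, which the abstract does not specify. The 50 Hz nominal mains frequency, the 1 Hz half-bandwidth, the three harmonics, the FFT-mask filter, and the function names `enf_band` and `enf_channels` are all illustrative assumptions.

```python
import numpy as np

def enf_band(signal, fs, center_hz, half_width_hz=1.0):
    """Keep only spectral content within +/- half_width_hz of center_hz.

    A crude FFT-mask band-pass, chosen for brevity (assumption: the paper
    may use a different filter design).
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    mask = np.abs(freqs - center_hz) <= half_width_hz
    return np.fft.irfft(spectrum * mask, n=len(signal))

def enf_channels(signal, fs, nominal_hz=50.0, n_harmonics=3):
    """Stack the fundamental and its harmonics, one per row.

    nominal_hz=50.0 is an assumption; some power grids run at 60 Hz.
    Returns an array of shape (n_harmonics, len(signal)).
    """
    bands = [enf_band(signal, fs, nominal_hz * k)
             for k in range(1, n_harmonics + 1)]
    return np.stack(bands)

# Synthetic 2-second clip: weak mains hum buried in noise (hypothetical data).
fs = 1000
t = np.arange(2 * fs) / fs
rng = np.random.default_rng(0)
clip = (0.05 * np.sin(2 * np.pi * 50.0 * t)
        + 0.02 * np.sin(2 * np.pi * 100.0 * t)
        + 0.1 * rng.standard_normal(t.size))

channels = enf_channels(clip, fs)
print(channels.shape)  # (3, 2000)
```

Each row could serve as a single-channel input, and the stacked array as a combined multi-channel input; how the paper actually shapes these inputs for the CNN is not stated in the abstract.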


    Published In

    IEEE Transactions on Multimedia  Volume 18, Issue 8
    August 2016
    222 pages

    Publisher

    IEEE Press

    Cited By

    • (2024) Deletion and insertion tampering detection for speech authentication based on fluctuating super vector of electrical network frequency. Speech Communication 158:C. DOI: 10.1016/j.specom.2024.103046. Online publication date: 1-Mar-2024
    • (2024) CDPNet: conformer-based dual path joint modeling network for bird sound recognition. Applied Intelligence 54:4 (3152-3168). DOI: 10.1007/s10489-024-05362-9. Online publication date: 1-Feb-2024
    • (2022) Audio Tampering Forensics Based on Representation Learning of ENF Phase Sequence. International Journal of Digital Crime and Forensics 14:1 (1-19). DOI: 10.4018/IJDCF.302894. Online publication date: 10-Jun-2022
    • (2021) Antiforensics of Speech Resampling Using Dual-Path Strategy. Wireless Communications & Mobile Computing 2021. DOI: 10.1155/2021/6640106. Online publication date: 1-Jan-2021
    • (2020) Detection of Voice Transformation Disguise Based on Deep Residual Net. Proceedings of the 2020 4th International Conference on Cryptography, Security and Privacy (126-130). DOI: 10.1145/3377644.3377645. Online publication date: 10-Jan-2020
    • (2020) Identification of VoIP Speech With Multiple Domain Deep Features. IEEE Transactions on Information Forensics and Security 15 (2253-2267). DOI: 10.1109/TIFS.2019.2960635. Online publication date: 7-Feb-2020
    • (2019) Deep Multi-scale Discriminative Networks for Double JPEG Compression Forensics. ACM Transactions on Intelligent Systems and Technology 10:2 (1-20). DOI: 10.1145/3301274. Online publication date: 15-Feb-2019
    • (2019) Auxiliary Classifier Generative Adversarial Network With Soft Labels in Imbalanced Acoustic Event Detection. IEEE Transactions on Multimedia 21:6 (1359-1371). DOI: 10.1109/TMM.2018.2879750. Online publication date: 22-May-2019
    • (2019) Statistical Model-Based Detector via Texture Weight Map: Application in Re-Sampling Authentication. IEEE Transactions on Multimedia 21:5 (1077-1092). DOI: 10.1109/TMM.2018.2872863. Online publication date: 23-Apr-2019
    • (2019) Enhancing unsupervised neural networks based text summarization with word embedding and ensemble learning. Expert Systems with Applications: An International Journal 123:C (195-211). DOI: 10.1016/j.eswa.2019.01.037. Online publication date: 1-Jun-2019
