Nothing Special   »   [go: up one dir, main page]

Cai et al., 2023 - Google Patents

Glitch in the matrix: A large scale benchmark for content driven audio–visual forgery detection and localization

Cai et al., 2023

View HTML
Document ID
7946657451095948353
Author
Cai Z
Ghosh S
Dhall A
Gedeon T
Stefanov K
Hayat M
Publication year
Publication venue
Computer Vision and Image Understanding

External Links

Snippet

Most deepfake detection methods focus on detecting spatial and/or spatio-temporal changes in facial attributes and are centered around the binary classification task of detecting whether a video is real or fake. This is because available benchmark datasets contain …
Continue reading at www.sciencedirect.com (HTML) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6267Classification techniques
    • G06K9/6268Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6217Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30781Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F17/30784Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass
    • G06N99/005Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46Extraction of features or characteristics of the image
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computer systems utilising knowledge based models
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06QDATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for a specific business sector, e.g. utilities or tourism
    • G06Q50/01Social networking
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computer systems based on specific mathematical models
    • G06N7/005Probabilistic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06QDATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce, e.g. shopping or e-commerce

Similar Documents

Publication Publication Date Title
Agnese et al. A survey and taxonomy of adversarial neural networks for text‐to‐image synthesis
Rana et al. Deepfake detection: A systematic literature review
Ahmed et al. Analysis survey on deepfake detection and recognition with convolutional neural networks
Munro et al. Multi-modal domain adaptation for fine-grained action recognition
Xu et al. Cross-modal relation-aware networks for audio-visual event localization
Botha et al. Fake news and deepfakes: A dangerous threat for 21st century information security
Cheng et al. Voice-face homogeneity tells deepfake
Guanghui et al. Multi-modal emotion recognition by fusing correlation features of speech-visual
Cai et al. Glitch in the matrix: A large scale benchmark for content driven audio–visual forgery detection and localization
Li et al. Mec 2017: Multimodal emotion recognition challenge
Lewis et al. Deepfake video detection based on spatial, spectral, and temporal inconsistencies using multimodal deep learning
JP2023537705A (en) AUDIO-VISUAL EVENT IDENTIFICATION SYSTEM, METHOD AND PROGRAM
Ristea et al. Emotion recognition system from speech and visual information based on convolutional neural networks
Zhang et al. Multi-modal sentiment classification with independent and interactive knowledge via semi-supervised learning
Zhang Voice keyword retrieval method using attention mechanism and multimodal information fusion
Altuncu et al. Deepfake: definitions, performance metrics and standards, datasets and benchmarks, and a meta-review
CN112733764A (en) Method for recognizing video emotion information based on multiple modes
Liu et al. How to synthesize a large-scale and trainable micro-expression dataset?
Maiano et al. Depthfake: a depth-based strategy for detecting deepfake videos
Liu et al. Audio-visual temporal forgery detection using embedding-level fusion and multi-dimensional contrastive loss
KR102279772B1 (en) Method and Apparatus for Generating Videos with The Arrow of Time
WO2024198438A1 (en) Model training method, retrieval method, and related apparatuses
Nayak et al. Exploiting spatio-temporal scene structure for wide-area activity analysis in unconstrained environments
Berrahal et al. A Comparative Analysis of Fake Image Detection in Generative Adversarial Networks and Variational Autoencoders
Nambiar et al. Exploring the Power of Deep Learning for Seamless Background Audio Generation in Videos