Cai et al., 2023

Glitch in the matrix: A large scale benchmark for content driven audio–visual forgery detection and localization

Document ID: 7946657451095948353
Author: Cai Z; Ghosh S; Dhall A; Gedeon T; Stefanov K; Hayat M
Publication year: 2023
Publication venue: Computer Vision and Image Understanding

Snippet

Most deepfake detection methods focus on detecting spatial and/or spatio-temporal changes in facial attributes and are centered around the binary classification task of detecting whether a video is real or fake. This is because available benchmark datasets contain …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for a specific business sector, e.g. utilities or tourism
- G06Q50/01—Social networking
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce, e.g. shopping or e-commerce

Similar Documents

Publication	Publication Date	Title
Agnese et al.	2020	A survey and taxonomy of adversarial neural networks for text‐to‐image synthesis
Rana et al.	2022	Deepfake detection: A systematic literature review
Ahmed et al.	2022	Analysis survey on deepfake detection and recognition with convolutional neural networks
Munro et al.	2020	Multi-modal domain adaptation for fine-grained action recognition
Xu et al.	2020	Cross-modal relation-aware networks for audio-visual event localization
Botha et al.	2020	Fake news and deepfakes: A dangerous threat for 21st century information security
Cheng et al.	2023	Voice-face homogeneity tells deepfake
Guanghui et al.	2021	Multi-modal emotion recognition by fusing correlation features of speech-visual
Cai et al.	2023	Glitch in the matrix: A large scale benchmark for content driven audio–visual forgery detection and localization
Li et al.	2018	Mec 2017: Multimodal emotion recognition challenge
Lewis et al.	2020	Deepfake video detection based on spatial, spectral, and temporal inconsistencies using multimodal deep learning
JP2023537705A (en)	2023-09-05	AUDIO-VISUAL EVENT IDENTIFICATION SYSTEM, METHOD AND PROGRAM
Ristea et al.	2019	Emotion recognition system from speech and visual information based on convolutional neural networks
Zhang et al.	2020	Multi-modal sentiment classification with independent and interactive knowledge via semi-supervised learning
Zhang	2021	Voice keyword retrieval method using attention mechanism and multimodal information fusion
Altuncu et al.	2022	Deepfake: definitions, performance metrics and standards, datasets and benchmarks, and a meta-review
CN112733764A (en)	2021-04-30	Method for recognizing video emotion information based on multiple modes
Liu et al.	2022	How to synthesize a large-scale and trainable micro-expression dataset?
Maiano et al.	2022	Depthfake: a depth-based strategy for detecting deepfake videos
Liu et al.	2023	Audio-visual temporal forgery detection using embedding-level fusion and multi-dimensional contrastive loss
KR102279772B1 (en)	2021-07-19	Method and Apparatus for Generating Videos with The Arrow of Time
WO2024198438A1 (en)	2024-10-03	Model training method, retrieval method, and related apparatuses
Nayak et al.	2013	Exploiting spatio-temporal scene structure for wide-area activity analysis in unconstrained environments
Berrahal et al.	2023	A Comparative Analysis of Fake Image Detection in Generative Adversarial Networks and Variational Autoencoders
Nambiar et al.	2023	Exploring the Power of Deep Learning for Seamless Background Audio Generation in Videos