Glitch in the matrix: A large scale benchmark for content driven audio–visual forgery detection and localization
Cai et al., 2023
- Document ID
- 7946657451095948353
- Author
- Cai Z
- Ghosh S
- Dhall A
- Gedeon T
- Stefanov K
- Hayat M
- Publication year
- 2023
- Publication venue
- Computer Vision and Image Understanding
Snippet
Most deepfake detection methods focus on detecting spatial and/or spatio-temporal changes in facial attributes and are centered around the binary classification task of detecting whether a video is real or fake. This is because available benchmark datasets contain …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for a specific business sector, e.g. utilities or tourism
- G06Q50/01—Social networking
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce, e.g. shopping or e-commerce
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Agnese et al. | A survey and taxonomy of adversarial neural networks for text‐to‐image synthesis | |
Rana et al. | Deepfake detection: A systematic literature review | |
Ahmed et al. | Analysis survey on deepfake detection and recognition with convolutional neural networks | |
Munro et al. | Multi-modal domain adaptation for fine-grained action recognition | |
Xu et al. | Cross-modal relation-aware networks for audio-visual event localization | |
Botha et al. | Fake news and deepfakes: A dangerous threat for 21st century information security | |
Cheng et al. | Voice-face homogeneity tells deepfake | |
Guanghui et al. | Multi-modal emotion recognition by fusing correlation features of speech-visual | |
Cai et al. | Glitch in the matrix: A large scale benchmark for content driven audio–visual forgery detection and localization | |
Li et al. | Mec 2017: Multimodal emotion recognition challenge | |
Lewis et al. | Deepfake video detection based on spatial, spectral, and temporal inconsistencies using multimodal deep learning | |
JP2023537705A (en) | AUDIO-VISUAL EVENT IDENTIFICATION SYSTEM, METHOD AND PROGRAM | |
Ristea et al. | Emotion recognition system from speech and visual information based on convolutional neural networks | |
Zhang et al. | Multi-modal sentiment classification with independent and interactive knowledge via semi-supervised learning | |
Zhang | Voice keyword retrieval method using attention mechanism and multimodal information fusion | |
Altuncu et al. | Deepfake: definitions, performance metrics and standards, datasets and benchmarks, and a meta-review | |
CN112733764A (en) | Method for recognizing video emotion information based on multiple modes | |
Liu et al. | How to synthesize a large-scale and trainable micro-expression dataset? | |
Maiano et al. | Depthfake: a depth-based strategy for detecting deepfake videos | |
Liu et al. | Audio-visual temporal forgery detection using embedding-level fusion and multi-dimensional contrastive loss | |
KR102279772B1 (en) | Method and Apparatus for Generating Videos with The Arrow of Time | |
WO2024198438A1 (en) | Model training method, retrieval method, and related apparatuses | |
Nayak et al. | Exploiting spatio-temporal scene structure for wide-area activity analysis in unconstrained environments | |
Berrahal et al. | A Comparative Analysis of Fake Image Detection in Generative Adversarial Networks and Variational Autoencoders | |
Nambiar et al. | Exploring the Power of Deep Learning for Seamless Background Audio Generation in Videos |