Zhang et al., 2018 - Google Patents
Video saliency prediction based on spatial-temporal two-stream network
- Document ID
- 9454016180501420119
- Author
- Zhang K
- Chen Z
- Publication year
- 2018
- Publication venue
- IEEE Transactions on Circuits and Systems for Video Technology
Snippet
In this paper, we propose a novel two-stream neural network for video saliency prediction. Unlike some traditional methods based on hand-crafted feature extraction and integration, our proposed method automatically learns saliency-related spatiotemporal features from …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00597—Acquiring or recognising eyes, e.g. iris verification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/20—Image acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Zhang et al. | Video saliency prediction based on spatial-temporal two-stream network | |
Lai et al. | Video saliency prediction using spatiotemporal residual attentive networks | |
Wang et al. | Revisiting video saliency prediction in the deep learning era | |
Wang et al. | A deep network solution for attention and aesthetics aware photo cropping | |
Yi et al. | Audio-driven talking face video generation with learning-based personalized head pose | |
Li et al. | Occlusion aware facial expression recognition using CNN with attention mechanism | |
Sun et al. | Models matter, so does training: An empirical study of CNNs for optical flow estimation | |
Wang et al. | Paying attention to video object pattern understanding | |
Wang et al. | Learning unsupervised video object segmentation through visual attention | |
Yang et al. | A dilated inception network for visual saliency prediction | |
Zhang et al. | Facial expression analysis under partial occlusion: A survey | |
Li et al. | Deep learning for micro-expression recognition: A survey | |
JP7476428B2 (en) | Image line of sight correction method, device, electronic device, computer-readable storage medium, and computer program | |
Zhou et al. | Salient region detection via integrating diffusion-based compactness and local contrast | |
Hu et al. | Global-local enhancement network for NMF-aware sign language recognition | |
Li et al. | Distortion-Adaptive Salient Object Detection in 360° Omnidirectional Images | |
Li et al. | Visual saliency computation: A machine learning perspective | |
Li et al. | Constrained fixation point based segmentation via deep neural network | |
Xie et al. | An overview of facial micro-expression analysis: Data, methodology and challenge | |
Thomas et al. | Perceptual video summarization—A new framework for video summarization | |
Chen et al. | Video saliency prediction using enhanced spatiotemporal alignment network | |
Liu et al. | Instance-level relative saliency ranking with graph reasoning | |
Zhou et al. | SignBERT: a BERT-based deep learning framework for continuous sign language recognition | |
Fang et al. | Visual attention prediction for stereoscopic video by multi-module fully convolutional network | |
Zhang et al. | A spatial-temporal recurrent neural network for video saliency prediction |