Véges et al., 2020
Temporal smoothing for 3D human pose estimation and localization for occluded people
- Document ID
- 13568775935158788852
- Author
- Véges M
- Lőrincz A
- Publication year
- 2020
- Publication venue
- Neural Information Processing: 27th International Conference, ICONIP 2020, Bangkok, Thailand, November 23–27, 2020, Proceedings, Part I
Snippet
In multi-person pose estimation, actors can be heavily occluded or even become fully invisible behind another person. While temporal methods can still predict a reasonable estimate for a temporarily disappeared pose using past and future frames, they exhibit large errors …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce, e.g. shopping or e-commerce
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Shu et al. | Feature-metric loss for self-supervised learning of depth and egomotion | |
Ming et al. | Deep learning for monocular depth estimation: A review | |
Cao et al. | Long-term human motion prediction with scene context | |
Girdhar et al. | Detect-and-track: Efficient pose estimation in videos | |
Véges et al. | Temporal smoothing for 3D human pose estimation and localization for occluded people | |
Zeng et al. | Deciwatch: A simple baseline for 10× efficient 2d and 3d pose estimation | |
Kim et al. | Towards sequence-level training for visual tracking | |
Zhang et al. | Key frame proposal network for efficient pose estimation in videos | |
Xue et al. | ECANet: Explicit cyclic attention-based network for video saliency prediction | |
Mocanu et al. | Deep-see face: A mobile face recognition system dedicated to visually impaired people | |
Saribas et al. | TRAT: Tracking by attention using spatio-temporal features | |
Berral-Soler et al. | RealHePoNet: a robust single-stage ConvNet for head pose estimation in the wild | |
Zhao et al. | Towards image-to-video translation: A structure-aware approach via multi-stage generative adversarial networks | |
Samadiani et al. | A multiple feature fusion framework for video emotion recognition in the wild | |
Zhang et al. | Unsupervised depth estimation from monocular videos with hybrid geometric-refined loss and contextual attention | |
Véges et al. | Multi-person absolute 3D human pose estimation with weak depth supervision | |
Wang et al. | Efficient global-local memory for real-time instrument segmentation of robotic surgical video | |
Bao et al. | KalmanFlow 2.0: Efficient video optical flow estimation via context-aware kalman filtering | |
An et al. | ARShoe: Real-time augmented reality shoe try-on system on smartphones | |
Ni | Application of motion tracking technology in movies, television production and photography using big data | |
Jagadeesh et al. | Dynamic FERNet: Deep learning with optimal feature selection for face expression recognition in video | |
Xia et al. | Motion attention deep transfer network for cross-database micro-expression recognition | |
Li et al. | TSwinPose: Enhanced monocular 3D human pose estimation with JointFlow | |
Savran | Multi-timescale boosting for efficient and improved event camera face pose alignment | |
Jang et al. | MOVIN: Real‐time Motion Capture using a Single LiDAR |