Predicting human-object interactions in egocentric videos
Benavent-Lledo et al., 2022
- Document ID
- 12191266182993679493
- Author
- Benavent-Lledo M
- Oprea S
- Castro-Vargas J
- Mulero-Perez D
- Garcia-Rodriguez J
- Publication year
- 2022
- Publication venue
- 2022 International Joint Conference on Neural Networks (IJCNN)
Snippet
Egocentric videos provide a rich source of hand-object interactions that support action recognition. However, prior to action recognition, one may need to detect the presence of hands and objects in the scene. In this work, we propose an action estimation architecture …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00288—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00335—Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | | Predicting camera viewpoint improves cross-dataset generalization for 3d human pose estimation |
Tian et al. | | Densely connected attentional pyramid residual network for human pose estimation |
Patil et al. | | Real time facial expression recognition using RealSense camera and ANN |
Saxena et al. | | Comparison and analysis of image-to-image generative adversarial networks: a survey |
Benavent-Lledo et al. | | Predicting human-object interactions in egocentric videos |
Yu | | Emotion monitoring for preschool children based on face recognition and emotion recognition algorithms |
Fu et al. | | Deformer: Dynamic fusion transformer for robust hand pose estimation |
Rani et al. | | An effectual classical dance pose estimation and classification system employing convolution neural network–long shortterm memory (CNN-LSTM) network for video sequences |
Truong et al. | | Laban movement analysis and hidden Markov models for dynamic 3D gesture recognition |
Callemein et al. | | Automated analysis of eye-tracker-based human-human interaction studies |
CN113902989A (en) | | Live scene detection method, storage medium and electronic device |
Balachandar et al. | | Deep learning technique based visually impaired people using YOLO V3 framework mechanism |
Angelopoulou et al. | | Evaluation of different chrominance models in the detection and reconstruction of faces and hands using the growing neural gas network |
Xu et al. | | Video Object Segmentation: Tasks, Datasets, and Methods |
Zimmer et al. | | Imposing temporal consistency on deep monocular body shape and pose estimation |
Jindal et al. | | Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation |
Ren et al. | | Toward three-dimensional human action recognition using a convolutional neural network with correctness-vigilant regularizer |
Ajay et al. | | Analyses of Machine Learning Techniques for Sign Language to Text conversion for Speech Impaired |
Rawat et al. | | Indian sign language recognition system for interrogative words using deep learning |
Patra et al. | | Hierarchical Windowed Graph Attention Network and a Large Scale Dataset for Isolated Indian Sign Language Recognition |
Geetha et al. | | A review on human activity recognition system |
Grzeszick | | Partially supervised learning of models for visual scene and object recognition |
Shaila et al. | | Emotion estimation from nose feature using pyramid structure |
Urano et al. | | Human pose recognition by memory-based hierarchical feature matching |
Liu | | Attention Based Temporal Convolutional Neural Network for Real-Time 3D Human Pose Reconstruction |