Benavent-Lledo et al., 2022

Predicting human-object interactions in egocentric videos

Document ID: 12191266182993679493
Authors: Benavent-Lledo M; Oprea S; Castro-Vargas J; Mulero-Perez D; Garcia-Rodriguez J
Publication year: 2022
Publication venue: 2022 International Joint Conference on Neural Networks (IJCNN)

Snippet

Egocentric videos provide a rich source of hand-object interactions that support action recognition. However, prior to action recognition, one may need to detect the presence of hands and objects in the scene. In this work, we propose an action estimation architecture …

Classifications

- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
      - G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
        - G06K9/62—Methods or arrangements for recognition using electronic means
          - G06K9/6267—Classification techniques
            - G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
      - G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
        - G06K9/62—Methods or arrangements for recognition using electronic means
          - G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
            - G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
              - G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
      - G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
        - G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
          - G06K9/00268—Feature extraction; Face representation
            - G06K9/00281—Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
      - G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
        - G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
          - G06K9/00288—Classification, e.g. identification
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
      - G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
        - G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
          - G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
      - G06T2207/00—Indexing scheme for image analysis or image enhancement
        - G06T2207/10—Image acquisition modality
          - G06T2207/10024—Color image
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
      - G06T7/00—Image analysis
        - G06T7/20—Analysis of motion
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
          - G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
            - G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
              - G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
      - G06T2207/00—Indexing scheme for image analysis or image enhancement
        - G06T2207/20—Special algorithmic details
          - G06T2207/20112—Image segmentation details
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
      - G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
        - G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
      - G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
        - G06K9/00335—Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
      - G06T2207/00—Indexing scheme for image analysis or image enhancement
        - G06T2207/30—Subject of image; Context of image processing
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N99/00—Subject matter not provided for in other groups of this subclass
        - G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run

Similar Documents

Publication	Publication Date	Title
Wang et al.	2020	Predicting camera viewpoint improves cross-dataset generalization for 3d human pose estimation
Tian et al.	2019	Densely connected attentional pyramid residual network for human pose estimation
Patil et al.	2016	Real time facial expression recognition using RealSense camera and ANN
Saxena et al.	2021	Comparison and analysis of image-to-image generative adversarial networks: a survey
Benavent-Lledo et al.	2022	Predicting human-object interactions in egocentric videos
Yu	2021	Emotion monitoring for preschool children based on face recognition and emotion recognition algorithms
Fu et al.	2023	Deformer: Dynamic fusion transformer for robust hand pose estimation
Rani et al.	2022	An effectual classical dance pose estimation and classification system employing convolution neural network–long shortterm memory (CNN-LSTM) network for video sequences
Truong et al.	2017	Laban movement analysis and hidden Markov models for dynamic 3D gesture recognition
Callemein et al.	2019	Automated analysis of eye-tracker-based human-human interaction studies
CN113902989A (en)	2022-01-07	Live scene detection method, storage medium and electronic device
Balachandar et al.	2021	Deep learning technique based visually impaired people using YOLO V3 framework mechanism
Angelopoulou et al.	2019	Evaluation of different chrominance models in the detection and reconstruction of faces and hands using the growing neural gas network
Xu et al.	2024	Video Object Segmentation: Tasks, Datasets, and Methods
Zimmer et al.	2023	Imposing temporal consistency on deep monocular body shape and pose estimation
Jindal et al.	2024	Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation
Ren et al.	2018	Toward three-dimensional human action recognition using a convolutional neural network with correctness-vigilant regularizer
Ajay et al.	2023	Analyses of Machine Learning Techniques for Sign Language to Text conversion for Speech Impaired
Rawat et al.	2023	Indian sign language recognition system for interrogative words using deep learning
Patra et al.	2024	Hierarchical Windowed Graph Attention Network and a Large Scale Dataset for Isolated Indian Sign Language Recognition
Geetha et al.	2018	A review on human activity recognition system
Grzeszick	2018	Partially supervised learning of models for visual scene and object recognition
Shaila et al.	2023	Emotion estimation from nose feature using pyramid structure
Urano et al.	2004	Human pose recognition by memory-based hierarchical feature matching
Liu	2019	Attention Based Temporal Convolutional Neural Network for Real-Time 3D Human Pose Reconstruction