Computer Science > Computer Vision and Pattern Recognition

arXiv:2312.06583 (cs)

[Submitted on 11 Dec 2023 (v1), last revised 23 Sep 2024 (this version, v2)]

Title: 3D Hand Pose Estimation in Everyday Egocentric Images

Authors: Aditya Prakash, Ruisen Tu, Matthew Chang, Saurabh Gupta

Abstract: 3D hand pose estimation in everyday egocentric images is challenging for several reasons: poor visual signal (occlusion by the object of interaction, low resolution, and motion blur), large perspective distortion (hands are close to the camera), and a lack of 3D annotations outside of controlled settings. Existing methods often take hand crops as input to focus on fine-grained visual information and cope with the poor visual signal, but the challenges arising from perspective distortion and the lack of 3D annotations in the wild have not been systematically studied. We focus on this gap and explore the impact of different practices: using crops as input, incorporating camera information, adding auxiliary supervision, and scaling up datasets. We provide several insights that apply to both convolutional and transformer models and lead to better performance. Based on our findings, we also present WildHands, a system for 3D hand pose estimation in everyday egocentric images. Zero-shot evaluation on 4 diverse datasets (H2O, AssemblyHands, Epic-Kitchens, Ego-Exo4D) demonstrates the effectiveness of our approach across 2D and 3D metrics, where we beat past methods by 7.4% to 66%. In system-level comparisons, WildHands achieves the best 3D hand pose on the ARCTIC egocentric split, and it outperforms FrankMocap across all metrics and HaMeR on 3 out of 6 metrics while being 10x smaller and trained on 5x less data.

Comments:	ECCV 2024, Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
Cite as:	arXiv:2312.06583 [cs.CV]
	(or arXiv:2312.06583v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2312.06583

Submission history

From: Aditya Prakash
[v1] Mon, 11 Dec 2023 18:15:47 UTC (12,073 KB)
[v2] Mon, 23 Sep 2024 14:32:08 UTC (1,938 KB)
