SDM3d: shape decomposition of multiple geometric priors for 3D pose estimation

Mengxi Jiang¹, Zhuliang Yu², Cuihua Li¹ & Yunqi Lei¹

Abstract

Recovering the 3D human pose from a single image with 2D joints is a challenging task in computer vision. The sparse representation (SR) model has been successfully adopted in 3D pose estimation approaches. However, because the available 3D training data are often collected in a constrained environment (i.e., indoors) with limited diversity of subjects and actions, most SR-based approaches generalize poorly to real-world scenarios, which may contain more complex cases. To alleviate this issue, this paper proposes SDM3d, a novel shape decomposition using multiple geometric priors for 3D pose estimation. SDM3d separates a 3D pose into a global structure and body deformations, each encoded explicitly through different prior constraints. Furthermore, a joint learning strategy is designed to learn two over-complete dictionaries from training data in order to capture more geometric prior information. We have evaluated SDM3d on four well-recognized benchmarks: Human3.6M, HumanEva-I, CMU MoCap, and MPII. The experimental results demonstrate the effectiveness of SDM3d.
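
As a rough illustration of the sparse representation idea the abstract builds on, the sketch below codes a flattened 3D pose as a sparse combination of atoms from an over-complete dictionary, using plain ISTA (iterative soft-thresholding) on an l1-regularized least-squares objective. It is not the SDM3d method: the random dictionary, the joint count, the regularization weight, and the names `soft_threshold` and `sparse_code` are all assumptions made for this example, and the paper's two learned dictionaries and prior constraints are not modeled.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1 (element-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_code(D, y, lam=0.1, n_iter=500):
    """ISTA for min_c 0.5 * ||y - D c||^2 + lam * ||c||_1.

    D: (d, k) dictionary with k atoms; y: (d,) signal to encode.
    """
    # Lipschitz constant of the gradient: squared spectral norm of D.
    L = np.linalg.norm(D, 2) ** 2
    c = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ c - y)          # gradient of the quadratic term
        c = soft_threshold(c - grad / L, lam / L)
    return c

# Hypothetical setup: 15 joints (45 coordinates) and 64 atoms,
# so the dictionary is over-complete (more atoms than dimensions).
rng = np.random.default_rng(0)
n_joints, n_atoms = 15, 64
D = rng.standard_normal((3 * n_joints, n_atoms))
D /= np.linalg.norm(D, axis=0)            # unit-norm atoms

# Synthetic "pose" built from three atoms, standing in for real data.
c_true = np.zeros(n_atoms)
c_true[[3, 17, 42]] = [1.5, -2.0, 1.0]
pose = D @ c_true

c_hat = sparse_code(D, pose, lam=0.01, n_iter=2000)
recon = D @ c_hat
rel_err = np.linalg.norm(recon - pose) / np.linalg.norm(pose)
print(f"relative reconstruction error: {rel_err:.4f}")
print(f"coefficients above 1e-3 in magnitude: {np.sum(np.abs(c_hat) > 1e-3)}")
```

In an actual SR-based pipeline the dictionary would be learned from motion-capture data and the fit would be made against observed 2D joints through a camera projection; both steps are omitted here to keep the sketch short.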

Acknowledgements

This research was supported by the National Natural Science Foundation of China (Grant No. 61671397).

Author information

Authors and Affiliations

Department of Computer Science, Xiamen University, Xiamen, 361005, China
Mengxi Jiang, Cuihua Li & Yunqi Lei
College of Automation Science and Engineering, South China University of Technology, Guangzhou, 510640, China
Zhuliang Yu

Corresponding author

Correspondence to Yunqi Lei.

Ethics declarations

Conflict of interest

No conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jiang, M., Yu, Z., Li, C. et al. SDM3d: shape decomposition of multiple geometric priors for 3D pose estimation. Neural Comput & Applic 33, 2165–2181 (2021). https://doi.org/10.1007/s00521-020-05086-0

Received: 12 November 2019
Accepted: 04 June 2020
Published: 24 June 2020
Issue Date: April 2021
DOI: https://doi.org/10.1007/s00521-020-05086-0
