Deep Learning-based Human Pose Estimation: A Survey

Published: 26 August 2023

Abstract

Human pose estimation aims to locate human body parts and build a representation of the human body (e.g., a body skeleton) from input data such as images and videos. It has drawn increasing attention during the past decade and has been used in a wide range of applications, including human-computer interaction, motion analysis, augmented reality, and virtual reality. Although recently developed deep learning-based solutions have achieved high performance in human pose estimation, challenges remain due to insufficient training data, depth ambiguities, and occlusion. The goal of this survey article is to provide a comprehensive review of recent deep learning-based solutions for both 2D and 3D pose estimation through a systematic analysis and comparison of these solutions based on their input data and inference procedures. More than 260 research papers published since 2014 are covered in this survey. Furthermore, 2D and 3D human pose estimation datasets and evaluation metrics are included. Quantitative performance comparisons of the reviewed methods on popular datasets are summarized and discussed. Finally, the remaining challenges, applications, and future research directions are discussed. A regularly updated project page is provided: https://github.com/zczcwh/DL-HPE.

References

[1]
M. Andriluka, U. Iqbal, E. Ensafutdinov, L. Pishchulin, A. Milan, J. Gall, and B. Schiele.2018. PoseTrack: A benchmark for human pose estimation and tracking. In CVPR.
[2]
Mykhaylo Andriluka, Umar Iqbal, Eldar Insafutdinov, Leonid Pishchulin, Anton Milan, Juergen Gall, and Bernt Schiele. 2018. PoseTrack: A benchmark for human pose estimation and tracking. In CVPR.
[3]
Mykhaylo Andriluka, Leonid Pishchulin, Peter Gehler, and Bernt Schiele. 2014. 2D human pose estimation: New benchmark and state of the art analysis. In CVPR.
[4]
Federico Angelini, Zeyu Fu, Yang Long, Ling Shao, and Syed Mohsen Naqvi. 2018. ActionXPose: A novel 2D multi-view pose-based algorithm for real-time human action recognition. arXiv preprint arXiv:1810.12126.
[5]
Anurag Arnab, Carl Doersch, and Andrew Zisserman. 2019. Exploiting temporal context for 3D human pose estimation in the wild. In CVPR.
[6]
Bruno Artacho and Andreas Savakis. 2020. UniPose: Unified human pose estimation in single images and videos. In CVPR.
[7]
Vasileios Belagiannis and Andrew Zisserman. 2017. Recurrent human pose estimation. In FG.
[8]
Abdallah Benzine, Florian Chabot, Bertrand Luvison, Quoc Cuong Pham, and Catherine Achard. 2020. PandaNet: Anchor-based single-shot multi-person 3D pose estimation. In CVPR.
[9]
Gedas Bertasius, Christoph Feichtenhofer, Du Tran, Jianbo Shi, and Lorenzo Torresani. 2019. Learning temporal pose estimation from sparsely-labeled videos. In NeurIPS.
[10]
Federica Bogo, Angjoo Kanazawa, Christoph Lassner, Peter Gehler, Javier Romero, and Michael J. Black. 2016. Keep it SMPL: Automatic estimation of 3D human pose and shape from a single image. In ECCV.
[11]
Adrian Bulat and Georgios Tzimiropoulos. 2016. Human pose estimation via convolutional part heatmap regression. In ECCV.
[12]
M. Burenius, J. Sullivan, and S. Carlsson. 2013. 3D Pictorial structures for multiple view articulated pose estimation. In CVPR.
[13]
Y. Cai, L. Ge, J. Liu, J. Cai, T. Cham, J. Yuan, and N. M. Thalmann. 2019. Exploiting spatial-temporal relationships for 3D pose estimation via graph convolutional networks. In ICCV.
[14]
Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, Haoqian Wang, Xinyu Zhou, Erjin Zhou, Xiangyu Zhang, and Jian Sun. 2020. Learning delicate local representations for multi-person pose estimation. arXiv preprint arXiv:2003.04030.
[15]
Zhe Cao, Hang Gao, Karttikeya Mangalam, Qizhi Cai, Minh Vo, and Jitendra Malik. 2020. Long-term human motion prediction with scene context. In ECCV.
[16]
Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh. 2017. Realtime multi-person 2D pose estimation using part affinity fields. In CVPR.
[17]
Joao Carreira, Pulkit Agrawal, Katerina Fragkiadaki, and Jitendra Malik. 2016. Human pose estimation with iterative error feedback. In CVPR.
[18]
Ching-Hang Chen and Deva Ramanan. 2017. 3D Human pose estimation = 2D pose estimation + matching. In CVPR.
[19]
Ching-Hang Chen, Ambrish Tyagi, Amit Agrawal, Dylan Drover, Rohith Mv, Stefan Stojanov, and James M. Rehg. 2019. Unsupervised 3D pose estimation with geometric self-supervision. In CVPR.
[20]
He Chen, Pengfei Guo, Pengfei Li, Gim Hee Lee, and Gregory Chirikjian. 2020. Multi-person 3D pose estimation in crowded scenes based on multi-view geometry. In ECCV.
[21]
Kenny Chen, Paolo Gabriel, Abdulwahab Alasfour, Chenghao Gong, Werner K. Doyle, Orrin Devinsky, Daniel Friedman, Patricia Dugan, Lucia Melloni, Thomas Thesen et al. 2018. Patient-specific pose estimation in clinical environments. In IEEE J. Translat. Eng. Health Med. Vol. 6, 1–11.
[22]
Long Chen, Haizhou Ai, Rui Chen, Zijie Zhuang, and Shuang Liu. 2020. Cross-view tracking for multi-human 3D pose estimation at over 100 FPS. In CVPR.
[23]
Tianlang Chen, Chen Fang, Xiaohui Shen, Yiheng Zhu, Zhili Chen, and Jiebo Luo. 2021. Anatomy-aware 3D human pose estimation with bone-based pose decomposition. In IEEE Trans. Circ. Syst. Vid. Technol, Vol. 32, 198–209.
[24]
Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. 2020. A simple framework for contrastive learning of visual representations. arXiv preprint arXiv:2002.05709.
[25]
Weiming Chen, Zijie Jiang, Hailin Guo, and Xiaoyang Ni. 2020. Fall detection based on key points of human-skeleton using OpenPose. In Symmetry, Vol. 12, 744.
[26]
Xianjie Chen and Alan L. Yuille. 2014. Articulated pose estimation by a graphical model with image dependent pairwise relations. In NeurIPS.
[27]
Yu Chen, Chunhua Shen, Xiu-Shen Wei, Lingqiao Liu, and Jian Yang. 2017. Adversarial PoseNet: A structure-aware convolutional network for human pose estimation. In ICCV.
[28]
Yucheng Chen, Yingli Tian, and Mingyi He. 2020. Monocular human pose estimation: A survey of deep learning-based methods. In Comput. Vis. Image Underst. Vol. 192, 102897.
[29]
Yilun Chen, Zhicheng Wang, Yuxiang Peng, Zhiqiang Zhang, Gang Yu, and Jian Sun. 2018. Cascaded pyramid network for multi-person pose estimation. In CVPR.
[30]
Zerui Chen, Yan Huang, Hongyuan Yu, Bin Xue, Ke Han, Yiru Guo, and Liang Wang. 2020. Towards part-aware monocular 3D human pose estimation: An architecture search approach. In ECCV.
[31]
Bowen Cheng, Bin Xiao, Jingdong Wang, Honghui Shi, Thomas S. Huang, and Lei Zhang. 2020. HigherHRNet: Scale-aware representation learning for bottom-up human pose estimation. In CVPR.
[32]
Yu Cheng, Bo Wang, Bo Yang, and Robby T. Tan. 2021. Graph and temporal convolutional networks for 3D multi-person pose estimation in monocular videos. In AAAI.
[33]
Yu Cheng, Bo Wang, Bo Yang, and Robby T. Tan. 2021. Monocular 3D multi-person pose estimation by integrating top-down and bottom-up networks. In CVPR.
[34]
Y. Cheng, B. Yang, B. Wang, Y. Wending, and R. Tan. 2019. Occlusion-aware networks for 3D human pose estimation in video. In ICCV.
[35]
Yu Cheng, Bo Yang, Bo Wang, Wending Yan, and Robby T. Tan. 2019. Occlusion-aware networks for 3D human pose estimation in video. In ICCV.
[36]
Junhyeong Cho, Kim Youwang, and Tae-Hyun Oh. 2022. Cross-attention of disentangled modalities for 3D human mesh recovery with transformers. In ECCV.
[37]
Hongsuk Choi, Gyeongsik Moon, Ju Yong Chang, and Kyoung Mu Lee. 2021. Beyond static features for temporally consistent 3D human pose and shape from a video. In CVPR.
[38]
Hongsuk Choi, Gyeongsik Moon, and Kyoung Mu Lee. 2020. Pose2Mesh: Graph convolutional network for 3D human pose and mesh recovery from a 2D human pose. In ECCV.
[39]
Chia-Jung Chou, Jui-Ting Chien, and Hwann-Tzong Chen. 2018. Self adversarial training for human pose estimation. In APSIPA ASC.
[40]
Xiao Chu, Wanli Ouyang, Hongsheng Li, and Xiaogang Wang. 2016. Structured feature learning for pose estimation. In CVPR.
[41]
Xiao Chu, Wei Yang, Wanli Ouyang, Cheng Ma, Alan L. Yuille, and Xiaogang Wang. 2017. Multi-context attention for human pose estimation. In CVPR.
[42]
H. Ci, C. Wang, X. Ma, and Y. Wang. 2019. Optimizing network structure for 3D human pose estimation. In ICCV.
[43]
Henry M. Clever, Zackory Erickson, Ariel Kapusta, Greg Turk, Karen Liu, and Charles C. Kemp. 2020. Bodies at rest: 3D human pose and shape estimation from a pressure image using synthetic data. In CVPR.
[44]
Rishabh Dabral, Anurag Mundhada, Uday Kusupati, Safeer Afaque, Abhishek Sharma, and Arjun Jain. 2018. Learning 3D human pose from structure and motion. In ECCV.
[45]
Srijan Das, Saurav Sharma, Rui Dai, François Brémond, and Monique Thonnat. 2020. VPN: Learning video-pose embedding for activities of daily living. In ECCV.
[46]
Bappaditya Debnath, Mary O’Brien, Motonori Yamaguchi, and Ardhendu Behera. 2018. Adapting MobileNets for mobile based upper body pose estimation. In AVSS.
[47]
Andreas Doering, Umar Iqbal, and Juergen Gall. 2018. Joint flow: Temporal flow fields for multi person tracking. arXiv preprint arXiv:1805.04596.
[48]
Junting Dong, Wen Jiang, Qixing Huang, Hujun Bao, and Xiaowei Zhou. 2019. Fast and robust multi-person 3D pose estimation from multiple views. In CVPR.
[49]
Zijian Dong, Jie Song, Xu Chen, Chen Guo, and Otmar Hilliges. 2021. Shape-aware multi-person pose estimation from multi-view images. In ICCV.
[50]
Dylan Drover, Ching-Hang Chen, Amit Agrawal, Ambrish Tyagi, and Cong Phuoc Huynh. 2018. Can 3D pose be learned from 2D projections alone? In ECCV.
[51]
Marcin Eichner, Manuel Marin-Jimenez, Andrew Zisserman, and Vittorio Ferrari. 2012. 2D articulated human pose estimation and retrieval in (almost) unconstrained still images. In Int. J. Comput. Vis. Vol. 99, 190–214.
[52]
Thomas Elsken, Jan Hendrik Metzen, and Frank Hutter. 2019. Neural architecture search: A survey. J. Mach. Learn. Res. Vol. 20, 1997–2017.
[53]
Matteo Fabbri, Fabio Lanzi, Simone Calderara, Stefano Alletto, and Rita Cucchiara. 2020. Compressed volumetric heatmaps for multi-person 3D pose estimation. In CVPR.
[54]
Xiaochuan Fan, Kang Zheng, Yuewei Lin, and Song Wang. 2015. Combining local appearance and holistic view: Dual-source deep neural networks for human pose estimation. In CVPR.
[55]
Hao-Shu Fang, Shuqin Xie, Yu-Wing Tai, and Cewu Lu. 2017. RMPE: Regional Multi-person Pose Estimation. In ICCV.
[56]
Mihai Fieraru, Anna Khoreva, Leonid Pishchulin, and Bernt Schiele. 2018. Learning to refine human pose estimation. In CVPR Workshops.
[57]
Martin Fisch and Ronald Clark. 2020. Orientation keypoints for 6D human pose estimation. arXiv preprint arXiv:2009.04930.
[58]
Georgios Georgakis, Ren Li, Srikrishna Karanam, Terrence Chen, Jana Kosecka, and Ziyan Wu. 2020. Hierarchical kinematic human mesh recovery. In ECCV.
[59]
Georgia Gkioxari, Alexander Toshev, and Navdeep Jaitly. 2016. Chained predictions using convolutional neural networks. In ECCV.
[60]
Wenjuan Gong, Xuena Zhang, Jordi Gonzàlez, Andrews Sobral, Thierry Bouwmans, Changhe Tu, and El-hadi Zahzah. 2016. Human pose estimation from monocular images: A comprehensive survey. In Sensors, Vol. 16, 1966.
[61]
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In NeurIPS.
[62]
Yiwen Gu, Shreya Pandit, Elham Saraee, Timothy Nordahl, Terry Ellis, and Margrit Betke. 2019. Home-based physical therapy with an interactive computer vision system. In ICCV Workshops.
[63]
Hengkai Guo, Tang Tang, Guozhong Luo, Riwei Chen, Yongchen Lu, and Linfu Wen. 2018. Multi-domain pose network for multi-person pose estimation and tracking. In ECCV Workshops.
[64]
I. Habibie, W. Xu, D. Mehta, G. Pons-Moll, and C. Theobalt. 2019. In the wild human pose estimation using explicit 2D features and intermediate 3D representations. In CVPR.
[65]
Mohamed Hassan, Vasileios Choutas, Dimitrios Tzionas, and Michael J. Black. 2019. Resolving 3D human pose ambiguities with 3D scene constraints. In ICCV.
[66]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR.
[67]
Michael B. Holte, Cuong Tran, Mohan M. Trivedi, and Thomas B. Moeslund. 2012. Human pose estimation and activity recognition from multi-view videos: Comparative explorations of recent developments. In IEEE J. Sel. Top. Signal Process. Vol. 6, 538–552.
[68]
Yilei Hua, Wenhan Wu, Ce Zheng, Aidong Lu, Mengyuan Liu, Chen Chen, and Shiqian Wu. 2023. Part aware contrastive learning for self-supervised action recognition. In IJCAI.
[69]
Congzhentao Huang, Shuai Jiang, Yang Li, Ziyue Zhang, Jason Traish, Chen Deng, Sam Ferguson, and Richard Yi Da Xu. 2020. End-to-end dynamic matching network for multi-view multi-person 3D pose estimation. In ECCV.
[70]
Fuyang Huang, Ailing Zeng, Minhao Liu, Qiuxia Lai, and Qiang Xu. 2020. DeepFuse: An IMU-aware network for real-time 3D human pose estimation from multi-view image. In WACV.
[71]
Junjie Huang, Zheng Zhu, Feng Guo, and Guan Huang. 2020. The devil is in the details: Delving into unbiased data processing for human pose estimation. In CVPR.
[72]
Shaoli Huang, Mingming Gong, and Dacheng Tao. 2017. A coarse-fine network for keypoint localization. In ICCV.
[73]
Xiaofei Huang, Nihang Fu, Shuangjun Liu, Kathan Vyas, Amirreza Farnoosh, and Sarah Ostadabbas. 2020. Invariant representation learning for infant pose estimation with small data. arXiv preprint arXiv:2010.06100.
[74]
Yinghao Huang, Manuel Kaufmann, Emre Aksan, Michael J. Black, Otmar Hilliges, and Gerard Pons-Moll. 2018. Deep inertial poser: Learning to reconstruct human pose from sparse inertial measurements in real time. In ACM Trans. Graph.
[75]
Eldar Insafutdinov, Mykhaylo Andriluka, Leonid Pishchulin, Siyu Tang, Evgeny Levinkov, Bjoern Andres, and Bernt Schiele. 2017. ArtTrack: Articulated multi-person tracking in the wild. In CVPR.
[76]
Eldar Insafutdinov, Leonid Pishchulin, Bjoern Andres, Mykhaylo Andriluka, and Bernt Schiele. 2016. DeeperCut: A deeper, stronger, and faster multi-person pose estimation model. In ECCV.
[77]
C. Ionescu, D. Papava, V. Olaru, and C. Sminchisescu. 2014. Human3.6M: Large scale datasets and predictive methods for 3D human sensing in natural environments. In IEEE Trans. Pattern Anal. Mach. Intell. Vol. 36, 1325–1339.
[78]
Umar Iqbal and Juergen Gall. 2016. Multi-person pose estimation with local joint-to-person associations. In ECCV.
[79]
Mariko Isogawa, Ye Yuan, Matthew O’Toole, and Kris M. Kitani. 2020. Optical non-line-of-sight physics-based 3D human pose estimation. In CVPR.
[80]
Ehsan Jahangiri and Alan L. Yuille. 2017. Generating multiple diverse hypotheses for human 3D pose consistent with 2D joint detections. In ICCV Workshops.
[81]
Arjun Jain, Jonathan Tompson, Yann LeCun, and Christoph Bregler. 2014. MoDeep: A deep learning framework using motion features for human pose estimation. In ACCV.
[82]
Naman Jain, Sahil Shah, Abhishek Kumar, and Arjun Jain. 2019. On the robustness of human pose estimation. In CVPR Workshops.
[83]
H. Jhuang, J. Gall, S. Zuffi, C. Schmid, and M. J. Black. 2013. Towards understanding action recognition. In ICCV.
[84]
Xiaopeng Ji, Qi Fang, Junting Dong, Qing Shuai, Wen Jiang, and hou. 2020. A survey on monocular 3D human pose estimation. In Virt. Real. Intell. Hardw. Vol. 2, 471–500.
[85]
Xiaofei Ji and Honghai Liu. 2009. Advances in view-invariant human motion analysis: A review. In IEEE Trans. Syst., Man, Cybern. Vol. 40, 13–24.
[86]
Haiyong Jiang, Jianfei Cai, and Jianmin Zheng. 2019. Skeleton-aware 3D human shape reconstruction from point clouds. In ICCV.
[87]
Wen Jiang, Nikos Kolotouros, Georgios Pavlakos, Xiaowei Zhou, and Kostas Daniilidis. 2020. Coherent reconstruction of multiple humans from a single image. In CVPR.
[88]
Sheng Jin, Wentao Liu, Enze Xie, Wenhai Wang, Chen Qian, Wanli Ouyang, and Ping Luo. 2020. Differentiable hierarchical graph grouping for multi-person pose estimation. arXiv preprint arXiv:2007.11864.
[89]
Sheng Jin, Lumin Xu, Jin Xu, Can Wang, Wentao Liu, Chen Qian, Wanli Ouyang, and Ping Luo. 2020. Whole-body human pose estimation in the wild. arXiv preprint arXiv:2007.11858.
[90]
Sam Johnson and Mark Everingham. 2010. Clustered pose and nonlinear appearance models for human pose estimation. In BMVC.
[91]
Sam Johnson and Mark Everingham. 2011. Learning effective human pose estimation from inaccurate annotation. In CVPR.
[92]
Hanbyul Joo, Tomas Simon, Xulong Li, Hao Liu, Lei Tan, Lin Gui, Sean Banerjee, Timothy Scott Godisart, Bart Nabbe, Iain Matthews, Takeo Kanade, Shohei Nobuhara, and Yaser Sheikh. 2017. Panoptic studio: A massively multiview system for social interaction capture. In IEEE Trans. Pattern Anal. Mach. Intell. 3334–3342.
[93]
H. Joo, T. Simon, and Y. Sheikh. 2018. Total capture: A 3D deformation model for tracking faces, hands, and bodies. In CVPR.
[94]
A. Kadkhodamohammadi, A. Gangi, M. de Mathelin, and N. Padoy. 2017. A multi-view RGB-D approach for human pose estimation in operating rooms. In WACV.
[95]
Ladislav Kavan. 2014. Part I: Direct skinning methods and deformation primitives. In ACM SIGGRAPH.
[96]
Lipeng Ke, Ming-Ching Chang, Honggang Qi, and Siwei Lyu. 2018. Multi-scale structure-aware network for human pose estimation. In ECCV.
[97]
Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, and Mubarak Shah. 2021. Transformers in vision: A survey. arXiv preprint arXiv:2101.01169.
[98]
Muhammed Kocabas, Nikos Athanasiou, and Michael J. Black. 2020. VIBE: Video Inference for human Body pose and shape Estimation. In CVPR.
[99]
Muhammed Kocabas, Salih Karagoz, and Emre Akbas. 2018. MultiPoseNet: Fast multi-person pose estimation using pose residual network. In ECCV.
[100]
Muhammed Kocabas, Salih Karagoz, and Emre Akbas. 2019. Self-supervised learning of 3D human pose using multi-view geometry. In CVPR.
[101]
N. Kolotouros, G. Pavlakos, M. Black, and K. Daniilidis. 2019. Learning to reconstruct 3D human pose and shape via model-fitting in the loop. In ICCV.
[102]
Nikos Kolotouros, Georgios Pavlakos, and Kostas Daniilidis. 2019. Convolutional mesh regression for single-image human shape reconstruction. In CVPR.
[103]
Nikos Kolotouros, Georgios Pavlakos, Dinesh Jayaraman, and Kostas Daniilidis. 2021. Probabilistic modeling for human mesh recovery. In ICCV.
[104]
Sven Kreiss, Lorenzo Bertoni, and Alexandre Alahi. 2019. PifPaf: Composite fields for human pose estimation. In CVPR.
[105]
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. In NeurIPS.
[106]
Jogendra Nath Kundu, Ambareesh Revanur, Govind Vitthal Waghmare, Rahul Mysore Venkatesh, and R. Venkatesh Babu. 2020. Unsupervised cross-modal alignment for multi-person 3D pose estimation. In ECCV.
[107]
Jogendra Nath Kundu, Siddharth Seth, Varun Jampani, Mugalodi Rakesh, R. Venkatesh Babu, and Anirban Chakraborty. 2020. Self-supervised 3D human pose estimation via part guided novel image synthesis. In CVPR.
[108]
Jogendra Nath Kundu, Siddharth Seth, Mv Rahul, Mugalodi Rakesh, Venkatesh Babu Radhakrishnan, and Anirban Chakraborty. 2020. Kinematic-structure-preserved representation for unsupervised 3D human pose estimation. In AAAI.
[109]
Christoph Lassner, Javier Romero, Martin Kiefel, Federica Bogo, Michael J. Black, and Peter V. Gehler. 2017. Unite the people: Closing the loop between 3D and 2D human representations. In CVPR.
[110]
Kyoungoh Lee, Inwoong Lee, and Sanghoon Lee. 2018. Propagating LSTM: 3D pose estimation based on joint interdependency. In ECCV.
[111]
Chen Li and Gim Hee Lee. 2019. Generating multiple hypotheses for 3D human pose estimation with mixture density network. In CVPR.
[112]
Jiefeng Li, Siyuan Bian, Ailing Zeng, Can Wang, Bo Pang, Wentao Liu, and Cewu Lu. 2021. Human pose regression with residual log-likelihood estimation. In ICCV.
[113]
Jiefeng Li, Can Wang, Wentao Liu, Chen Qian, and Cewu Lu. 2020. HMOR: Hierarchical Multi-person Ordinal Relations for monocular multi-person 3D pose estimation. In ECCV.
[114]
Jiefeng Li, Can Wang, Hao Zhu, Yihuan Mao, Hao-Shu Fang, and Cewu Lu. 2019. CrowdPose: Efficient crowded scenes pose estimation and a new benchmark. In ICCV.
[115]
Jiefeng Li, Chao Xu, Zhicun Chen, Siyuan Bian, Lixin Yang, and Cewu Lu. 2021. HybrIK: A hybrid analytical-neural inverse kinematics solution for 3D human pose and shape estimation. In CVPR.
[116]
Ke Li, Shijie Wang, Xiang Zhang, Yifan Xu, Weijian Xu, and Zhuowen Tu. 2021. Pose recognition with cascade transformers. In CVPR.
[117]
Sijin Li and Antoni B. Chan. 2014. 3D human pose estimation from monocular images with deep convolutional neural network. In ACCV.
[118]
Sijin Li, Zhi-Qiang Liu, and Antoni B. Chan. 2014. Heterogeneous multi-task learning for human pose estimation with deep convolutional neural network. In CVPR Workshops.
[119]
Sijin Li, Weichen Zhang, and Antoni B. Chan. 2015. Maximum-margin structured learning with deep networks for 3D human pose estimation. In ICCV.
[120]
Wenhao Li, Hong Liu, Hao Tang, Pichao Wang, and Luc Van Gool. 2022. MHFormer: Multi-hypothesis transformer for 3D human pose estimation. In CVPR.
[121]
Wenbo Li, Zhicheng Wang, Binyi Yin, Qixiang Peng, Yuming Du, Tianzi Xiao, Gang Yu, Hongtao Lu, Yichen Wei, and Jian Sun. 2019. Rethinking on multi-stage networks for human pose estimation. arXiv preprint arXiv:1901.00148.
[122]
Yizhuo Li, Miao Hao, Zonglin Di, Nitesh Bharadwaj Gundavarapu, and Xiaolong Wang. 2021. Test-time personalization with a transformer for human pose estimation. In NeurIPS.
[123]
Yining Li, Chen Huang, and Chen Change Loy. 2019. Dense intrinsic appearance flow for human pose transfer. In CVPR.
[124]
Yanjie Li, Sen Yang, Peidong Liu, Shoukui Zhang, Yunxiao Wang, Zhicheng Wang, Wankou Yang, and Shu-Tao Xia. 2022. SimCC: A simple coordinate classification perspective for human pose estimation. In ECCV.
[125]
Yanjie Li, Shoukui Zhang, Zhicheng Wang, Sen Yang, Wankou Yang, Shu-Tao Xia, and Erjin Zhou. 2021. TokenPose: Learning keypoint tokens for human pose estimation. In ICCV.
[126]
Zhongguo Li, Anders Heyden, and Magnus Oskarsson. 2020. A novel joint points and silhouette-based method to estimate 3D human pose and shape. arXiv preprint arXiv:2012.06109.
[127]
Z. Li, X. Wang, F. Wang, and P. Jiang. 2019. On boosting single-frame 3D human pose estimation via monocular videos. In ICCV.
[128]
Junbang Liang and Ming C. Lin. 2019. Shape-aware human pose and shape reconstruction using multi-view images. In ICCV.
[129]
Ita Lifshitz, Ethan Fetaya, and Shimon Ullman. 2016. Human pose estimation using deep consensus voting. In ECCV.
[130]
Kevin Lin, Lijuan Wang, and Zicheng Liu. 2021. End-to-end human pose and mesh reconstruction with transformers. In CVPR.
[131]
Kevin Lin, Lijuan Wang, and Zicheng Liu. 2021. Mesh graphormer. In ICCV.
[132]
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. 2014. Microsoft COCO: Common objects in context. In ECCV.
[133]
Weiyao Lin, Huabin Liu, Shizhan Liu, Yuxi Li, Rui Qian, Tao Wang, Ning Xu, Hongkai Xiong, Guo-Jun Qi, and Nicu Sebe. 2020. Human in events: A large-scale benchmark for human-centric video analysis in complex events. arXiv preprint arXiv:2005.04490.
[134]
Huajun Liu, Fuqiang Liu, Xinyi Fan, and Dong Huang. 2021. Polarized self-attention: Towards high-quality pixel-wise regression. arXiv preprint arXiv:2107.00782.
[135]
Jian Liu, Naveed Akhtar, and Ajmal Mian. 2019. Adversarial attack on skeleton-based human action recognition. arXiv preprint arXiv:1909.06500.
[136]
Jingyuan Liu, Hongbo Fu, and Chiew-Lan Tai. 2020. PoseTween: Pose-driven tween animation. In ACM UIST.
[137]
Kenkun Liu, Rongqi Ding, Zhiming Zou, Le Wang, and Wei Tang. 2020. A comprehensive study of weight sharing in graph networks for 3D human pose estimation. In ECCV.
[138]
Ruixu Liu, Ju Shen, He Wang, Chen Chen, Sen-ching Cheung, and Vijayan Asari. 2020. Attention mechanism exploits temporal contexts: Real-time 3D human pose reconstruction. In CVPR.
[139]
Zhenguang Liu, Haoming Chen, Runyang Feng, Shuang Wu, Shouling Ji, Bailin Yang, and Xun Wang. 2021. Deep dual consecutive network for human pose estimation. In CVPR.
[140]
Zhenguang Liu, Runyang Feng, Haoming Chen, Shuang Wu, Yixing Gao, Yunjun Gao, and Xiang Wang. 2022. Temporal feature alignment and mutual information maximization for video-based human pose estimation. In CVPR.
[141]
Zhao Liu, Jianke Zhu, Jiajun Bu, and Chun Chen. 2015. A survey of human pose estimation: The body parts parsing based methods. In J. Vis. Commun. Image Repres. Vol. 32, 10–19.
[142]
Jonathan Long, Evan Shelhamer, and Trevor Darrell. 2015. Fully convolutional networks for semantic segmentation. In CVPR.
[143]
Matthew Loper, Naureen Mahmood, Javier Romero, Gerard Pons-Moll, and Michael J. Black. 2015. SMPL: A skinned multi-person linear model. In ACM Trans. Graph. Vol. 34, 1–16.
[144]
Mandy Lu, Kathleen Poston, Adolf Pfefferbaum, Edith V. Sullivan, Li Fei-Fei, Kilian M. Pohl, Juan Carlos Niebles, and Ehsan Adeli. 2020. Vision-based estimation of MDS-UPDRS gait scores for assessing Parkinson’s disease motor severity. arXiv preprint arXiv:2007.08920.
[145]
Yue Luo, Jimmy Ren, Zhouxia Wang, Wenxiu Sun, Jinshan Pan, Jianbo Liu, Jiahao Pang, and Liang Lin. 2018. LSTM pose machines. In CVPR.
[146]
Zhengxiong Luo, Zhicheng Wang, Yan Huang, Liang Wang, Tieniu Tan, and Erjin Zhou. 2021. Rethinking the heatmap regression for bottom-up human pose estimation. In CVPR.
[147]
Diogo C. Luvizon, David Picard, and Hedi Tabia. 2018. 2D/3D pose estimation and action recognition using multitask deep learning. In CVPR.
[148]
Diogo C. Luvizon, Hedi Tabia, and David Picard. 2019. Human pose regression by combining indirect part detection and contextual information. In Comput. Graph, Vol. 85, 15–22.
[149]
Haoyu Ma, Liangjian Chen, Deying Kong, Zhe Wang, Xingwei Liu, Hao Tang, Xiangyi Yan, Yusheng Xie, Shih-Yao Lin, and Xiaohui Xie. 2021. TransFusion: Cross-view fusion with transformer for 3D human pose estimation. In BMVC.
[150]
Haoyu Ma, Zhe Wang, Yifei Chen, Deying Kong, Liangjian Chen, Xingwei Liu, Xiangyi Yan, Hao Tang, and Xiaohui Xie. 2022. PPT: Token-pruned pose transformer for monocular and multi-view human pose estimation. In ECCV.
[151]
Prathmesh Madhu, Angel Villar-Corrales, Ronak Kosti, Torsten Bendschus, Corinna Reinhardt, Peter Bell, Andreas Maier, and Vincent Christlein. 2020. Enhancing human pose estimation in ancient vase paintings via perceptually-grounded style transfer learning. arXiv preprint arXiv:2012.05616.
[152]
Naureen Mahmood, Nima Ghorbani, Nikolaus F. Troje, Gerard Pons-Moll, and Michael J. Black. 2019. AMASS: Archive of Motion capture As Surface Shapes. In ICCV.
[153]
Weian Mao, Yongtao Ge, Chunhua Shen, Zhi Tian, Xinlong Wang, and Zhibin Wang. 2021. TFPose: Direct human pose estimation with transformers. arXiv preprint arXiv:2103.15320.
[154]
Weian Mao, Yongtao Ge, Chunhua Shen, Zhi Tian, Xinlong Wang, Zhibin Wang, and Anton van den Hengel. 2022. Poseur: Direct human pose regression with transformers. In ECCV.
[155]
Amir Markovitz, Gilad Sharir, Itamar Friedman, Lihi Zelnik-Manor, and Shai Avidan. 2020. Graph embedded pose clustering for anomaly detection. In CVPR.
[156]
Julieta Martinez, Rayat Hossain, Javier Romero, and James J. Little. 2017. A simple yet effective baseline for 3D human pose estimation. In ICCV.
[157]
D. Mehta, H. Rhodin, D. Casas, P. Fua, O. Sotnychenko, W. Xu, and C. Theobalt. 2017. Monocular 3D human pose estimation in the wild using improved CNN supervision. In 3DV.
[158]
Dushyant Mehta, Oleksandr Sotnychenko, Franziska Mueller, Weipeng Xu, Mohamed Elgharib, Pascal Fua, Hans-Peter Seidel, Helge Rhodin, Gerard Pons-Moll, and Christian Theobalt. 2020. XNect: Real-time multi-person 3D motion capture with a single RGB camera. In ACM TOG, Vol. 39, 82–1.
[159]
Dushyant Mehta, Oleksandr Sotnychenko, Franziska Mueller, Weipeng Xu, Srinath Sridhar, Gerard Pons-Moll, and Christian Theobalt. 2018. Single-shot multi-person 3D pose estimation from monocular RGB. In 3DV.
[160]
Dushyant Mehta, Srinath Sridhar, Oleksandr Sotnychenko, Helge Rhodin, Mohammad Shafiei, Hans-Peter Seidel et al. 2017. VNect: Real-time 3D human pose estimation with a single RGB camera. In ACM TOG, Vol. 36. 1–14.
[161]
Antonio S. Micilotta, Eng-Jon Ong, and Richard Bowden. 2006. Real-time upper body detection and 3D pose estimation in monoscopic images. In ECCV.
[162]
Rahul Mitra, Nitesh B. Gundavarapu, Abhishek Sharma, and Arjun Jain. 2020. Multiview-consistent semi-supervised learning for 3D human pose estimation. In CVPR.
[163]
Thomas B. Moeslund and Erik Granum. 2001. A survey of computer vision-based human motion capture. In Comput. Vis. Image Underst. Vol. 81, 231–268.
[164]
Thomas B. Moeslund, Adrian Hilton, and Volker Krüger. 2006. A survey of advances in vision-based human motion capture and analysis. In Comput. Vis. Image Underst. Vol. 104, 90–126.
[165]
Gyeongsik Moon, Juyong Chang, and Kyoung Mu Lee. 2019. Camera distance-aware top-down approach for 3D multi-person pose estimation from a single RGB image. In ICCV.
[166]
Gyeongsik Moon, Ju Yong Chang, and Kyoung Mu Lee. 2019. PoseFix: Model-agnostic general human pose refinement network. In CVPR.
[167]
Gyeongsik Moon and Kyoung Mu Lee. 2020. I2L-MeshNet: Image-to-Lixel prediction network for accurate 3D human pose and mesh estimation from a single RGB image. In ECCV.
[168]
Francesc Moreno-Noguer. 2017. 3D human pose estimation from a single image via distance matrix regression. In CVPR.
[169]
Tewodros Legesse Munea, Yalew Zelalem Jembre, Halefom Tekle Weldegebriel, Longbiao Chen, Chenxi Huang, and Chenhui Yang. 2020. The progress of human pose estimation: A survey and taxonomy of models applied in 2D human pose estimation. In IEEE Access, Vol. 8, 133330–133348.
[170]
Alejandro Newell, Zhiao Huang, and Jia Deng. 2017. Associative embedding: End-to-end learning for joint detection and grouping. In NeurIPS.
[171]
Alejandro Newell, Kaiyu Yang, and Jia Deng. 2016. Stacked hourglass networks for human pose estimation. In ECCV.
[172]
Aiden Nibali, Zhen He, Stuart Morgan, and Luke Prendergast. 2018. Numerical coordinate regression with convolutional neural networks. arXiv preprint arXiv:1801.07372.
[173]
B. X. Nie, P. Wei, and S. Zhu. 2017. Monocular 3D human pose estimation by predicting depth on joints. In ICCV.
[174]
Qiang Nie, Ziwei Liu, and Yunhui Liu. 2020. Unsupervised human 3D pose representation with viewpoint and pose disentanglement. In ECCV.
[175]
Xuecheng Nie, Jiashi Feng, Jianfeng Zhang, and Shuicheng Yan. 2019. Single-stage multi-person pose machines. In ICCV.
[176]
Mohamed Omran, Christoph Lassner, Gerard Pons-Moll, Peter V. Gehler, and Bernt Schiele. 2018. Neural body fitting: Unifying deep learning and model-based human pose and shape estimation. In 3DV.
[177]
Ahmed A. A. Osman, Timo Bolkart, and Michael J. Black. 2020. STAR: A spare trained articulated human body regressor. In ECCV.
[178]
Paschalis Panteleris and Antonis Argyros. 2021. PE-former: Pose estimation transformer. arXiv preprint arXiv:2112.04981.
[179]
George Papandreou, Tyler Zhu, Liang-Chieh Chen, Spyros Gidaris, Jonathan Tompson, and Kevin Murphy. 2018. PersonLab: Person pose estimation and instance segmentation with a bottom-up, part-based, geometric embedding model. In ECCV.
[180]
George Papandreou, Tyler Zhu, Nori Kanazawa, Alexander Toshev, Jonathan Tompson, Chris Bregler, and Kevin Murphy. 2017. Towards accurate multi-person pose estimation in the wild. In CVPR.
[181]
Chaitanya Patel, Zhouyingcheng Liao, and Gerard Pons-Moll. 2020. TailorNet: Predicting clothing in 3D as a function of human pose, shape and garment style. In CVPR.
[182]
Georgios Pavlakos, Vasileios Choutas, Nima Ghorbani, Timo Bolkart, Ahmed A. A. Osman, Dimitrios Tzionas, and Michael J. Black. 2019. Expressive body capture: 3D hands, face, and body from a single image. In CVPR.
[183]
Georgios Pavlakos, Xiaowei Zhou, and Kostas Daniilidis. 2018. Ordinal depth supervision for 3D human pose estimation. In CVPR.
[184]
Georgios Pavlakos, Xiaowei Zhou, Konstantinos G. Derpanis, and Kostas Daniilidis. 2017. Coarse-to-fine volumetric prediction for single-image 3D human pose. In CVPR.
[185]
Georgios Pavlakos, Xiaowei Zhou, Konstantinos G. Derpanis, and Kostas Daniilidis. 2017. Harvesting multiple views for marker-less 3D human pose annotations. In CVPR.
[186]
Georgios Pavlakos, Luyang Zhu, Xiaowei Zhou, and Kostas Daniilidis. 2018. Learning to estimate 3D human pose and shape from a single color image. In CVPR.
[187]
Dario Pavllo, Christoph Feichtenhofer, David Grangier, and Michael Auli. 2019. 3D human pose estimation in video with temporal convolutions and semi-supervised training. In CVPR.
[188]
Xi Peng, Zhiqiang Tang, Fei Yang, Rogerio S. Feris, and Dimitris Metaxas. 2018. Jointly optimize data augmentation and network training: Adversarial data augmentation in human pose estimation. In CVPR.
[189]
Tomas Pfister, James Charles, and Andrew Zisserman. 2015. Flowing convnets for human pose estimation in videos. In ICCV.
[190]
Tomas Pfister, Karen Simonyan, James Charles, and Andrew Zisserman. 2014. Deep convolutional neural networks for efficient pose estimation in gesture videos. In ACCV.
[191]
Aleksis Pirinen, Erik Gärtner, and Cristian Sminchisescu. 2019. Domes to drones: Self-supervised active triangulation for 3D human pose reconstruction. In NeurIPS.
[192]
Leonid Pishchulin, Eldar Insafutdinov, Siyu Tang, Bjoern Andres, Mykhaylo Andriluka, Peter V. Gehler, and Bernt Schiele. 2016. DeepCut: Joint subset partition and labeling for multi person pose estimation. In CVPR.
[193]
Gerard Pons-Moll, Javier Romero, Naureen Mahmood, and Michael J. Black. 2015. Dyna: A model of dynamic human shape in motion. In ACM TOG, Vol. 34, 1–14.
[194]
Ronald Poppe. 2007. Vision-based human motion analysis: An overview. In Comput. Vis. Image Underst. Vol. 108, 4–18.
[195]
Ammar Qammaz and Antonis A. Argyros. 2019. MocapNET: Ensemble of SNN encoders for 3D human pose estimation in RGB images. In BMVC.
[196]
Charles R. Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas. 2017. PointNet: Deep learning on point sets for 3D classification and segmentation. In CVPR.
[197]
Charles Ruizhongtai Qi, Li Yi, Hao Su, and Leonidas J. Guibas. 2017. PointNet++: Deep hierarchical feature learning on point sets in a metric space. In NeurIPS.
[198]
Haibo Qiu, Chunyu Wang, Jingdong Wang, Naiyan Wang, and Wenjun Zeng. 2019. Cross view fusion for 3D human pose estimation. In ICCV.
[199]
Lingteng Qiu, Xuanye Zhang, Yanran Li, Guanbin Li, Xiaojun Wu, Zixiang Xiong, Xiaoguang Han, and Shuguang Cui. 2020. Peeking into occluded joints: A novel framework for crowd pose estimation. arXiv preprint arXiv:2003.10506.
[200]
Varun Ramakrishna, Daniel Munoz, Martial Hebert, James Andrew Bagnell, and Yaser Sheikh. 2014. Pose machines: Articulated pose estimation via inference machines. In ECCV.
[201]
Mir Rayat Imtiaz Hossain and James J. Little. 2018. Exploiting temporal information for 3D human pose estimation. In ECCV.
[202]
N. Dinesh Reddy, Laurent Guigues, Leonid Pishchulin, Jayan Eledath, and Srinivasa G. Narasimhan. 2021. TesseTrack: End-to-end learnable multi-person articulated 3D pose tracking. In CVPR.
[203]
Edoardo Remelli, Shangchen Han, Sina Honari, Pascal Fua, and Robert Wang. 2020. Lightweight multi-view 3D pose estimation through camera-disentangled representation. In CVPR.
[204]
Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. In NeurIPS.
[205]
Helge Rhodin, Mathieu Salzmann, and Pascal Fua. 2018. Unsupervised geometry-aware representation for 3D human pose estimation. In ECCV.
[206]
Helge Rhodin, Jörg Spörri, Isinsu Katircioglu, Victor Constantin, Frédéric Meyer, Erich Müller, Mathieu Salzmann, and Pascal Fua. 2018. Learning monocular 3D human pose estimation from multi-view images. In CVPR.
[207]
G. Rogez, P. Weinzaepfel, and C. Schmid. 2017. LCR-Net: Localization-classification-regression for human pose. In CVPR.
[208]
Grégory Rogez, Philippe Weinzaepfel, and Cordelia Schmid. 2019. LCR-Net++: Multi-person 2D and 3D pose detection in natural images. In IEEE Trans. Pattern Anal. Mach. Intell. Vol. 42, 1146–1161.
[209]
Sebastian Ruder. 2017. An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098.
[210]
Nitin Saini, Eric Price, Rahul Tallamraju, Raffi Enficiaud, Roman Ludwig, Igor Martinović, Aamir Ahmad, and Michael Black. 2019. Markerless outdoor human motion capture using multiple autonomous micro aerial vehicles. In ICCV.
[211]
Shunsuke Saito, Zeng Huang, Ryota Natsume, Shigeo Morishima, Angjoo Kanazawa, and Hao Li. 2019. PIFu: Pixel-aligned implicit function for high-resolution clothed human digitization. In ICCV.
[212]
Shunsuke Saito, Tomas Simon, Jason Saragih, and Hanbyul Joo. 2020. PIFuHD: Multi-level pixel-aligned implicit function for high-resolution 3D human digitization. In CVPR.
[213]
Ben Sapp and Ben Taskar. 2013. MODEC: Multimodal decomposable models for human pose estimation. In CVPR.
[214]
Nikolaos Sarafianos, Bogdan Boteanu, Bogdan Ionescu, and Ioannis A. Kakadiaris. 2016. 3D human pose estimation: A review of the literature and analysis of covariates. In Comput. Vis. Image Underst. Vol. 152, 1–20.
[215]
Saurabh Sharma, Pavan Teja Varigonda, Prashast Bindal, Abhishek Sharma, and Arjun Jain. 2019. Monocular 3D human pose estimation by generation and ordinal ranking. In ICCV.
[216]
Dahu Shi, Xing Wei, Liangqi Li, Ye Ren, and Wenming Tan. 2022. End-to-end multi-person pose estimation with transformers. In CVPR.
[217]
L. Sigal, A. Balan, and M. J. Black. 2010. HumanEva: Synchronized video and motion capture dataset and baseline algorithm for evaluation of articulated human motion. In Int. J. Comput. Vis. Vol. 87, 4.
[218]
Michael Snower, Asim Kadav, Farley Lai, and Hans Peter Graf. 2020. 15 keypoints is all you need. In CVPR.
[219]
Kai Su, Dongdong Yu, Zhenqi Xu, Xin Geng, and Changhu Wang. 2019. Multi-person pose estimation with enhanced channel-wise and spatial information. In CVPR.
[220]
Jennifer J. Sun, Jiaping Zhao, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, and Ting Liu. 2020. View-invariant probabilistic embedding for human pose. In ECCV.
[221]
Ke Sun, Bin Xiao, Dong Liu, and Jingdong Wang. 2019. Deep high-resolution representation learning for human pose estimation. In CVPR.
[222]
Xiao Sun, Jiaxiang Shang, Shuang Liang, and Yichen Wei. 2017. Compositional human pose regression. In ICCV.
[223]
Wei Tang and Ying Wu. 2019. Does learning specific features for related parts help human pose estimation? In CVPR.
[224]
Wei Tang, Pei Yu, and Ying Wu. 2018. Deeply learned compositional models for human pose estimation. In ECCV.
[225]
Bugra Tekin, Isinsu Katircioglu, Mathieu Salzmann, Vincent Lepetit, and Pascal Fua. 2016. Structured prediction of 3D human pose with deep neural networks. In BMVC.
[226]
Bugra Tekin, Pablo Márquez-Neila, Mathieu Salzmann, and Pascal Fua. 2017. Learning to fuse 2D and 3D image cues for monocular body pose estimation. In ICCV.
[227]
Bugra Tekin, Artem Rozantsev, Vincent Lepetit, and Pascal Fua. 2016. Direct prediction of 3D body poses from motion compensated sequences. In CVPR.
[228]
Zhi Tian, Hao Chen, and Chunhua Shen. 2019. DirectPose: Direct end-to-end multi-person pose estimation. arXiv preprint arXiv:1911.07451.
[229]
Denis Tome, Thiemo Alldieck, Patrick Peluse, Gerard Pons-Moll, Lourdes Agapito, Hernan Badino, and Fernando De la Torre. 2020. SelfPose: 3D egocentric pose estimation from a headset mounted camera. arXiv preprint arXiv:2011.01519.
[230]
Denis Tome, Patrick Peluse, Lourdes Agapito, and Hernan Badino. 2019. xR-EgoPose: Egocentric 3D human pose from an HMD camera. In ICCV.
[231]
Jonathan Tompson, Ross Goroshin, Arjun Jain, Yann LeCun, and Christoph Bregler. 2015. Efficient object localization using convolutional networks. In CVPR.
[232]
Jonathan J. Tompson, Arjun Jain, Yann LeCun, and Christoph Bregler. 2014. Joint training of a convolutional network and a graphical model for human pose estimation. In NeurIPS.
[233]
Alexander Toshev and Christian Szegedy. 2014. DeepPose: Human pose estimation via deep neural networks. In CVPR.
[234]
Matt Trumble, Andrew Gilbert, Charles Malleson, Adrian Hilton, and John Collomosse. 2017. Total capture: 3D human pose estimation fusing video and inertial sensors. In BMVC.
[235]
Hanyue Tu, Chunyu Wang, and Wenjun Zeng. 2020. VoxelPose: Towards multi-camera 3D human pose estimation in wild environment. In ECCV.
[236]
Hsiao-Yu Fish Tung, Hsiao-Wei Tung, Ersin Yumer, and Katerina Fragkiadaki. 2017. Self-supervised learning of motion capture. In NeurIPS.
[237]
Rafi Umer, Andreas Doering, Bastian Leibe, and Juergen Gall. 2020. Self-supervised keypoint correspondences for multi-person pose estimation and tracking in videos. arXiv preprint arXiv:2004.12652.
[238]
Gul Varol, Javier Romero, Xavier Martin, Naureen Mahmood, Michael J. Black, Ivan Laptev, and Cordelia Schmid. 2017. Learning from synthetic humans. In CVPR.
[239]
Vince Tan, Ignas Budvytis, and Roberto Cipolla. 2017. Indirect deep structured learning for 3D human body shape and pose prediction. In BMVC.
[240]
Timo von Marcard, Roberto Henschel, Michael J. Black, Bodo Rosenhahn, and Gerard Pons-Moll. 2018. Recovering accurate 3D human pose in the wild using IMUs and a moving camera. In ECCV.
[241]
Timo von Marcard, Bodo Rosenhahn, Michael J. Black, and Gerard Pons-Moll. 2017. Sparse inertial poser: Automatic 3D human pose estimation from sparse IMUs. In Computer Graphics Forum.
[242]
Bastian Wandt and Bodo Rosenhahn. 2019. RepNet: Weakly supervised training of an adversarial reprojection network for 3D human pose estimation. In CVPR.
[243]
Haoyang Wang, Riza Alp Guler, Iasonas Kokkinos, George Papandreou, and Stefanos Zafeiriou. 2020. BLSM: A bone-level skinned model of the human mesh. In ECCV.
[244]
Haixin Wang, Lu Zhou, Yingying Chen, Ming Tang, and Jinqiao Wang. 2022. Regularizing vector embedding in bottom-up human pose estimation. In ECCV.
[245]
Jue Wang, Shaoli Huang, Xinchao Wang, and Dacheng Tao. 2019. Not all parts are created equal: 3D pose estimation by modeling bi-directional dependencies of body parts. In ICCV.
[246]
Jian Wang, Xiang Long, Yuan Gao, Errui Ding, and Shilei Wen. 2020. Graph-PCNN: Two stage human pose estimation with graph pose refinement. arXiv preprint arXiv:2007.10599.
[247]
Jianbo Wang, Kai Qiu, Houwen Peng, Jianlong Fu, and Jianke Zhu. 2019. AI Coach: Deep human pose estimation and analysis for personalized athletic training assistance. In ACM MM.
[248]
Jingbo Wang, Sijie Yan, Yuanjun Xiong, and Dahua Lin. 2020. Motion guided 3D pose estimation from videos. In ECCV.
[249]
Kangkan Wang, Jin Xie, Guofeng Zhang, Lei Liu, and Jian Yang. 2020. Sequential 3D human pose and shape estimation from point clouds. In CVPR.
[250]
Min Wang, Xipeng Chen, Wentao Liu, Chen Qian, Liang Lin, and Lizhuang Ma. 2018. DRPose3D: Depth ranking in 3D human pose estimation. In IJCAI.
[251]
Tao Wang, Jianfeng Zhang, Yujun Cai, Shuicheng Yan, and Jiashi Feng. 2021. Direct multi-view multi-person 3D human pose estimation. In NeurIPS.
[252]
Yihan Wang, Muyang Li, Han Cai, Wei-Ming Chen, and Song Han. 2022. LitePose: Efficient architecture design for 2D human pose estimation. In CVPR.
[253]
Zitian Wang, Xuecheng Nie, Xiaochao Qu, Yunpeng Chen, and Si Liu. 2022. Distribution-aware single-stage models for multi-person 3D pose estimation. In CVPR.
[254]
Shih-En Wei, Varun Ramakrishna, Takeo Kanade, and Yaser Sheikh. 2016. Convolutional pose machines. In CVPR.
[255]
C. Weng, B. Curless, and I. Kemelmacher-Shlizerman. 2019. Photo Wake-Up: 3D character animation from a single photo. In CVPR.
[256]
Nora S. Willett, Hijung Valentina Shin, Zeyu Jin, Wilmot Li, and Adam Finkelstein. 2020. Pose2Pose: Pose selection and transfer for 2D character animation. In IUI.
[257]
Jiahong Wu, He Zheng, Bo Zhao, Yixin Li, Baoming Yan, Rui Liang, Wenjia Wang, Shipei Zhou et al. 2017. AI challenger: A large-scale dataset for going deeper in image understanding. arXiv preprint arXiv:1711.06475.
[258]
Donglai Xiang, Hanbyul Joo, and Yaser Sheikh. 2019. Monocular total capture: Posing face, body, and hands in the wild. In CVPR.
[259]
Bin Xiao, Haiping Wu, and Yichen Wei. 2018. Simple baselines for human pose estimation and tracking. In ECCV.
[260]
Rongchang Xie, Chunyu Wang, and Yizhou Wang. 2020. MetaFuse: A pre-trained fusion model for human pose estimation. In CVPR.
[261]
Fu Xiong, Boshen Zhang, Yang Xiao, Zhiguo Cao, Taidong Yu, Joey Tianyi Zhou, and Junsong Yuan. 2019. A2J: Anchor-to-joint regression network for 3D articulated pose estimation from a single depth image. In ICCV.
[262]
Hongyi Xu, Eduard Gabriel Bazavan, Andrei Zanfir, William T. Freeman, Rahul Sukthankar, and Cristian Sminchisescu. 2020. GHUM & GHUML: Generative 3D human shape and articulated pose models. In CVPR.
[263]
Jingwei Xu, Zhenbo Yu, Bingbing Ni, Jiancheng Yang, Xiaokang Yang, and Wenjun Zhang. 2020. Deep kinematics analysis for monocular 3D human pose estimation. In CVPR.
[264]
Lumin Xu, Yingda Guan, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang, and Xiaogang Wang. 2021. ViPNAS: Efficient video pose estimation via neural architecture search. In CVPR.
[265]
Weipeng Xu, Avishek Chatterjee, Michael Zollhoefer, Helge Rhodin, Pascal Fua, Hans-Peter Seidel, and Christian Theobalt. 2019. Mo2Cap2: Real-time mobile 3D motion capture with a cap-mounted fisheye camera. In IEEE TVCG Proc. VR. Vol. 25, 2093–2101.
[266]
Xiangyu Xu, Hao Chen, Francesc Moreno-Noguer, Laszlo A. Jeni, and Fernando De la Torre. 2020. 3D human shape and pose from a single low-resolution image with self-supervised learning. In ECCV.
[267]
Sijie Yan, Yuanjun Xiong, and Dahua Lin. 2018. Spatial temporal graph convolutional networks for skeleton-based action recognition. In AAAI.
[268]
Sen Yang, Zhibin Quan, Mu Nie, and Wankou Yang. 2021. TransPose: Keypoint localization via transformer. In ICCV.
[269]
Wei Yang, Shuang Li, Wanli Ouyang, Hongsheng Li, and Xiaogang Wang. 2017. Learning feature pyramids for human pose estimation. In ICCV.
[270]
Wei Yang, Wanli Ouyang, Hongsheng Li, and Xiaogang Wang. 2016. End-to-end learning of deformable mixture of parts and deep convolutional neural networks for human pose estimation. In CVPR.
[271]
Wei Yang, Wanli Ouyang, Xiaolong Wang, Jimmy Ren, Hongsheng Li, and Xiaogang Wang. 2018. 3D human pose estimation in the wild by adversarial learning. In CVPR.
[272]
Yi Yang and Deva Ramanan. 2012. Articulated human detection with flexible mixtures of parts. In IEEE Trans. Pattern Anal. Mach. Intell. Vol. 35, 2878–2890.
[273]
Hang Ye, Wentao Zhu, Chunyu Wang, Rujie Wu, and Yizhou Wang. 2022. Faster VoxelPose: Real-time 3D human pose estimation by orthographic projection. In ECCV.
[274]
Changqian Yu, Bin Xiao, Changxin Gao, Lu Yuan, Lei Zhang, Nong Sang, and Jingdong Wang. 2021. Lite-HRNet: A lightweight high-resolution network. In CVPR.
[275]
T. Yu, J. Zhao, Z. Zheng, K. Guo, Q. Dai, H. Li, G. Pons-Moll, and Y. Liu. 2019. DoubleFusion: Real-time capture of human performances with inner body shapes from a single depth sensor. In IEEE Trans. Pattern Anal. Mach. Intell. 7287–7296.
[276]
Tao Yu, Zerong Zheng, Yuan Zhong, Jianhui Zhao, Qionghai Dai, Gerard Pons-Moll, and Yebin Liu. 2019. SimulCap: Single-view human performance capture with cloth simulation. In CVPR.
[277]
Yuhui Yuan, Rao Fu, Lang Huang, Weihong Lin, Chao Zhang, Xilin Chen, and Jingdong Wang. 2021. HRFormer: High-resolution vision transformer for dense prediction. In NeurIPS.
[278]
Andrei Zanfir, Eduard Gabriel Bazavan, Hongyi Xu, Bill Freeman, Rahul Sukthankar, and Cristian Sminchisescu. 2020. Weakly supervised 3D human pose and shape reconstruction with normalizing flows. In ECCV.
[279]
A. Zanfir, E. Marinoiu, and C. Sminchisescu. 2018. Monocular 3D pose and shape estimation of multiple people in natural scenes: The importance of multiple scene constraints. In CVPR.
[280]
Andrei Zanfir, Elisabeta Marinoiu, Mihai Zanfir, Alin-Ionut Popa, and Cristian Sminchisescu. 2018. Deep network for the integrated 3D sensing of multiple people in natural images. In NeurIPS.
[281]
Ailing Zeng, Xiao Sun, Fuyang Huang, Minhao Liu, Qiang Xu, and Stephen Lin. 2020. SRNet: Improving generalization in 3D human pose estimation with a split-and-recombine approach. In ECCV.
[282]
W. Zeng, W. Ouyang, P. Luo, W. Liu, and X. Wang. 2020. 3D human mesh regression with dense correspondence. In CVPR.
[283]
Feng Zhang, Xiatian Zhu, Hanbin Dai, Mao Ye, and Ce Zhu. 2020. Distribution-aware coordinate representation for human pose estimation. In CVPR.
[284]
Feng Zhang, Xiatian Zhu, and Mao Ye. 2019. Fast human pose estimation. In CVPR.
[285]
Hong Zhang, Hao Ouyang, Shu Liu, Xiaojuan Qi, Xiaoyong Shen, Ruigang Yang, and Jiaya Jia. 2019. Human pose estimation with spatial contextual information. arXiv preprint arXiv:1901.01760.
[286]
Haotian Zhang, Cristobal Sciutto, Maneesh Agrawala, and Kayvon Fatahalian. 2020. Vid2Player: Controllable video sprites that behave and appear like professional tennis players. arXiv preprint arXiv:2008.04524.
[287]
Hongwen Zhang, Yating Tian, Xinchi Zhou, Wanli Ouyang, Yebin Liu, Limin Wang, and Zhenan Sun. 2021. PyMAF: 3D human pose and shape regression with pyramidal mesh alignment feedback loop. In ICCV.
[288]
Jinlu Zhang, Zhigang Tu, Jianyu Yang, Yujin Chen, and Junsong Yuan. 2022. MixSTE: Seq2seq mixed spatio-temporal encoder for 3D human pose estimation in video. In CVPR.
[289]
Tianshu Zhang, Buzhen Huang, and Yangang Wang. 2020. Object-occluded human shape and pose estimation from a single color image. In CVPR.
[290]
Wenqiang Zhang, Jiemin Fang, Xinggang Wang, and Wenyu Liu. 2020. EfficientPose: Efficient human pose estimation with neural architecture search. arXiv preprint arXiv:2012.07086.
[291]
Weiyu Zhang, Menglong Zhu, and Konstantinos G. Derpanis. 2013. From actemes to action: A strongly-supervised representation for detailed action understanding. In ICCV.
[292]
Yuxiang Zhang, Liang An, Tao Yu, Xiu Li, Kun Li, and Yebin Liu. 2020. 4D association graph for realtime multi-person motion capture using multiple video cameras. In CVPR.
[293]
Yuexi Zhang, Yin Wang, Octavia Camps, and Mario Sznaier. 2020. Key frame proposal network for efficient pose estimation in videos. arXiv preprint arXiv:2007.15217.
[294]
Zhe Zhang, Chunyu Wang, Wenhu Qin, and Wenjun Zeng. 2020. Fusing wearable IMUs with multi-view images for human pose estimation: A geometric approach. In CVPR.
[295]
Zhe Zhang, Chunyu Wang, Weichao Qiu, Wenhu Qin, and Wenjun Zeng. 2020. AdaFuse: Adaptive multiview fusion for accurate human pose estimation in the wild. In Int. J. Comput. Vis. Vol. 129, 703–718.
[296]
Long Zhao, Xi Peng, Yu Tian, Mubbasir Kapadia, and Dimitris N. Metaxas. 2019. Semantic graph convolutional networks for 3D human pose regression. In CVPR.
[297]
Mingmin Zhao, Yingcheng Liu, Aniruddh Raghu, Tianhong Li, Hang Zhao, Antonio Torralba, and Dina Katabi. 2019. Through-wall human mesh recovery using radio signals. In ICCV.
[298]
Mingmin Zhao, Yonglong Tian, Hang Zhao, Mohammad Abu Alsheikh, Tianhong Li, Rumen Hristov, Zachary Kabelac, Dina Katabi, and Antonio Torralba. 2018. RF-based 3D skeletons. In SIGCOMM.
[299]
Qitao Zhao, Ce Zheng, Mengyuan Liu, Pichao Wang, and Chen Chen. 2023. PoseFormerV2: Exploring frequency domain for efficient and robust 3D human pose estimation. In CVPR.
[300]
Jianan Zhen, Qi Fang, Jiaming Sun, Wentao Liu, Wei Jiang, Hujun Bao, and Xiaowei Zhou. 2020. SMAP: Single-shot multi-person absolute 3D pose estimation. In ECCV.
[301]
Ce Zheng, Xianpeng Liu, Guo-Jun Qi, and Chen Chen. 2023. POTTER: Pooling attention transformer for efficient human mesh recovery. In CVPR.
[302]
Ce Zheng, Matias Mendieta, Pu Wang, Aidong Lu, and Chen Chen. 2022. A lightweight graph transformer network for human mesh reconstruction from 2D human pose. In ACM Multimedia.
[303]
Ce Zheng, Matias Mendieta, Taojiannan Yang, Guo-Jun Qi, and Chen Chen. 2023. FeatER: An efficient network for human reconstruction via feature map-based TransformER. In CVPR.
[304]
Ce Zheng, Sijie Zhu, Matias Mendieta, Taojiannan Yang, Chen Chen, and Zhengming Ding. 2021. 3D human pose estimation with spatial and temporal transformers. In ICCV.
[305]
Tiancheng Zhi, Christoph Lassner, Tony Tung, Carsten Stoll, Srinivasa G. Narasimhan, and Minh Vo. 2020. TexMesh: Reconstructing detailed human texture and geometry from RGB-D video. In ECCV.
[306]
Keyang Zhou, Bharat Lal Bhatnagar, and Gerard Pons-Moll. 2020. Unsupervised shape and pose disentanglement for 3D meshes. arXiv preprint arXiv:2007.11341.
[307]
K. Zhou, X. Han, N. Jiang, K. Jia, and J. Lu. 2019. HEMlets pose: Learning part-centric heatmap triplets for accurate 3D human pose estimation. In ICCV.
[308]
Xingyi Zhou, Qixing Huang, Xiao Sun, Xiangyang Xue, and Yichen Wei. 2017. Towards 3D human pose estimation in the wild: A weakly-supervised approach. In ICCV.
[309]
Xingyi Zhou, Xiao Sun, Wei Zhang, Shuang Liang, and Yichen Wei. 2016. Deep kinematic pose regression. In ECCV.
[310]
Xingyi Zhou, Dequan Wang, and Philipp Krähenbühl. 2019. Objects as points. arXiv preprint arXiv:1904.07850.
[311]
Xiaowei Zhou, Menglong Zhu, Kosta Derpanis, and Kostas Daniilidis. 2016. Sparseness meets deepness: 3D human pose estimation from monocular video. In CVPR.
[312]
Xiaowei Zhou, Menglong Zhu, Georgios Pavlakos, Spyridon Leonardos, Konstantinos G. Derpanis, and Kostas Daniilidis. 2018. MonoCap: Monocular human motion capture using a CNN coupled with a geometric prior. In IEEE Trans. Pattern Anal. Mach. Intell. Vol. 41, 901–914.
[313]
Hao Zhu, Xinxin Zuo, Sen Wang, Xun Cao, and Ruigang Yang. 2019. Detailed human shape estimation from a single image by hierarchical mesh deformation. In CVPR.
[314]
Luyang Zhu, Konstantinos Rematas, Brian Curless, Steve Seitz, and Ira Kemelmacher-Shlizerman. 2020. Reconstructing NBA players. In ECCV.
[315]
Xiangyu Zhu, Yingying Jiang, and Zhenbo Luo. 2017. Multi-person pose estimation for PoseTrack with enhanced part affinity fields. In ICCV PoseTrack Workshop.
[316]
Zhiming Zou and Wei Tang. 2021. Modulated graph convolutional network for 3D human pose estimation. In ICCV.
[317]
Silvia Zuffi and Michael J. Black. 2015. The stitched puppet: A graphical model of 3D human shape and pose. In CVPR.

Published In

ACM Computing Surveys  Volume 56, Issue 1
January 2024
918 pages
EISSN: 1557-7341
DOI: 10.1145/3613490

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 August 2023
Online AM: 09 June 2023
Accepted: 17 May 2023
Revised: 01 April 2023
Received: 23 January 2022
Published in CSUR Volume 56, Issue 1

Author Tags

  1. Survey of human pose estimation
  2. 2D and 3D pose estimation
  3. deep learning-based pose estimation
  4. pose estimation datasets
  5. pose estimation metrics

Qualifiers

  • Survey
