Abstract
Current human pose estimation networks are difficult to deploy on lightweight devices because of their large number of parameters. Knowledge distillation is an effective solution, but the learning ability of the student network remains insufficient for four reasons: (1) multi-teacher distillation suffers from an error avalanche problem; (2) heatmaps generated by teachers contain noise, which degrades the model; (3) the effect of self-knowledge distillation is ignored; (4) pose estimation is usually treated as a regression problem, while the fact that it is also a classification problem is overlooked. To address these problems, we propose a densely guided self-knowledge distillation framework named DSKD to solve the error avalanche problem, introduce a binarization operation to reduce the noise in the teacher's heatmaps, and add a classification loss to the total loss to guide the student's learning. Experimental results show that our method effectively improves the performance of different lightweight models.
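The abstract does not give the binarization rule or the form of the losses, so the following is only an illustrative sketch of one possible reading, not the authors' implementation. It assumes that a teacher heatmap is binarized against a fraction of its peak value, that the regression and distillation terms are mean squared errors, and that the classification term is a binary cross-entropy against the binarized teacher heatmap. The function names, the `threshold` parameter, and the weights `alpha` and `beta` are all hypothetical.

```python
import numpy as np


def binarize_heatmap(heatmap, threshold=0.5):
    """Hypothetical rule: 1 where the value reaches threshold * peak, else 0."""
    peak = heatmap.max()
    if peak <= 0:
        # An empty heatmap carries no keypoint evidence.
        return np.zeros_like(heatmap)
    return (heatmap >= threshold * peak).astype(heatmap.dtype)


def mse_loss(pred, target):
    """Mean squared error, used here for the regression-style terms."""
    return float(np.mean((pred - target) ** 2))


def bce_loss(pred, target, eps=1e-7):
    """Binary cross-entropy; predictions are clipped to avoid log(0)."""
    p = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(p) + (1.0 - target) * np.log(1.0 - p)))


def total_loss(student, teacher, ground_truth, alpha=0.5, beta=0.1):
    """Assumed combination: ground-truth regression, distillation, classification."""
    regression = mse_loss(student, ground_truth)
    distillation = mse_loss(student, teacher)
    classification = bce_loss(student, binarize_heatmap(teacher))
    return (1.0 - alpha) * regression + alpha * distillation + beta * classification
```

In this sketch, binarization discards the low-amplitude values of the teacher heatmap, which is where noise would be expected to sit, and the classification term treats each pixel as a keypoint or background decision. How DSKD actually weights and schedules these terms across teachers is not stated in the abstract.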
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grant 61976079, in part by the Guangxi Key Research and Development Program under Grant 2021AB20147, and in part by the Anhui Key Research and Development Program under Grant 202004a05020039.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wu, M., Zhao, ZQ., Li, J., Tian, W. (2023). Lightweight Human Pose Estimation Based on Densely Guided Self-Knowledge Distillation. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14255. Springer, Cham. https://doi.org/10.1007/978-3-031-44210-0_34
DOI: https://doi.org/10.1007/978-3-031-44210-0_34
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44209-4
Online ISBN: 978-3-031-44210-0
eBook Packages: Computer Science, Computer Science (R0)