Abstract
Human parts detection has made remarkable progress due to the development of deep convolutional networks. However, many state-of-the-art (SOTA) detection methods incur high computational cost and remain difficult to deploy on edge devices with limited computing resources. In this paper, we propose a lightweight Cascade Center-based Framework, called CCF-Net, for human parts detection. First, a Gaussian-Induced penalty strategy is designed to ensure that the network can handle objects of various scales. Second, we use a Cascade Attention Module to capture relations between different feature maps, which refines intermediate features. Third, with our novel cross-dataset training strategy, the framework fully exploits datasets with incomplete annotations and achieves better performance. Finally, Center-based Knowledge Distillation is proposed to enable student models to learn better representations without additional cost. Experiments show that our method achieves new SOTA performance on the Human-Parts and COCO Human Parts benchmarks. (The datasets used in this paper were downloaded and experimented on by Kai Ye from Shenzhen University.)
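The abstract does not define the Gaussian-Induced penalty, so the following is only a rough sketch of the general idea behind center-based detectors. It is not the CCF-Net formulation. It renders a Gaussian peak at an object center, with a spread that grows with box size, and derives the reduced penalty weights (1 - Y)^beta used for negative locations in CenterNet-style focal losses. The function names, the `sigma_scale` parameter, and the choice of sigma as a fixed fraction of the box size are all assumptions made for illustration.

```python
import math


def gaussian_heatmap(height, width, cx, cy, box_w, box_h, sigma_scale=1 / 6):
    """Render a 2D Gaussian peak centered on (cx, cy).

    sigma_scale is a hypothetical knob: the spread is proportional to the
    box size, so larger objects get a wider peak.
    """
    sigma_x = max(box_w * sigma_scale, 1e-6)
    sigma_y = max(box_h * sigma_scale, 1e-6)
    heatmap = []
    for y in range(height):
        row = []
        for x in range(width):
            exponent = -(
                (x - cx) ** 2 / (2 * sigma_x ** 2)
                + (y - cy) ** 2 / (2 * sigma_y ** 2)
            )
            row.append(math.exp(exponent))
        heatmap.append(row)
    return heatmap


def penalty_weights(heatmap, beta=4.0):
    """Weights (1 - Y)^beta: negatives close to a center are penalized less."""
    return [[(1.0 - value) ** beta for value in row] for row in heatmap]


# A small object and a large object centered at the same location.
small = gaussian_heatmap(5, 5, cx=2, cy=2, box_w=6, box_h=6)
large = gaussian_heatmap(5, 5, cx=2, cy=2, box_w=12, box_h=12)
weights = penalty_weights(small)
```

In this sketch the center cell has heatmap value 1 and penalty weight 0, and the weight rises towards 1 with distance from the center. The larger box yields a wider peak, which is one simple way a center-based target can adapt to object scale.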
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant 91959108, and by the Shenzhen Municipal Science and Technology Innovation Council under Grant JCYJ20220531101412030. We thank Qualcomm Incorporated for its support.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ye, K., Ji, H., Li, Y., Wang, L., Liu, P., Shen, L. (2023). CCF-Net: A Cascade Center-Based Framework Towards Efficient Human Parts Detection. In: Dang-Nguyen, DT., et al. MultiMedia Modeling. MMM 2023. Lecture Notes in Computer Science, vol 13834. Springer, Cham. https://doi.org/10.1007/978-3-031-27818-1_15
DOI: https://doi.org/10.1007/978-3-031-27818-1_15
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-27817-4
Online ISBN: 978-3-031-27818-1
eBook Packages: Computer Science, Computer Science (R0)