Abstract
The accuracy of crowd counting is sensitive to variations in head scale in congested scenes. Several counting networks, such as crowd density pre-classification networks and multi-column counting networks, have been proposed to model the different head scales. However, most of them have complex structures with many parameters, which makes them challenging to deploy in practical applications. To this end, we propose a lightweight crowd counting network termed PDDNet. The front-end of PDDNet uses the first 13 layers of GhostNet to extract crowd features, and the back-end is built from the proposed lightweight pyramidal convolution (LPC) modules to extract multi-scale features. Finally, the extracted multi-scale features are fed to transposed convolution layers to regress the final crowd density map. We conduct extensive experiments on commonly used crowd counting datasets, i.e., ShanghaiTech, UCF_QNRF, and NWPU_Crowd. The experimental results demonstrate the superiority of our model over state-of-the-art methods.
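To make the "lightweight" and "multi-scale" claims concrete, the following sketch compares the weight count of a pyramid of standard convolutions with a pyramid of depth-wise separable convolutions, and shows how dilation widens the area each 3 x 3 kernel covers. It is an illustration under stated assumptions, not the authors' LPC design: the channel width, the dilation rates, and the use of a 1 x 1 point-wise convolution after each depth-wise convolution are hypothetical choices that the abstract does not specify.

```python
# Illustrative sketch only: the channel width, dilation rates, and the
# depth-wise + point-wise layout are assumed, not taken from the PDDNet paper.

def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depth-wise conv followed by a 1 x 1 point-wise conv."""
    return c_in * k * k + c_in * c_out

def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel at the given dilation rate."""
    return k + (k - 1) * (dilation - 1)

CHANNELS = 64            # hypothetical branch width
KERNEL = 3
DILATIONS = (1, 2, 3)    # hypothetical pyramid of dilation rates

# Dilation does not change the number of weights, only the area they span,
# so every branch in the pyramid costs the same number of parameters.
standard_total = sum(
    standard_conv_params(CHANNELS, CHANNELS, KERNEL) for _ in DILATIONS
)
lightweight_total = sum(
    depthwise_separable_params(CHANNELS, CHANNELS, KERNEL) for _ in DILATIONS
)
extents = [effective_kernel_size(KERNEL, d) for d in DILATIONS]

print("standard pyramid weights:   ", standard_total)
print("depth-wise pyramid weights: ", lightweight_total)
print("effective kernel sizes:     ", extents)
```

Under these assumed settings the depth-wise pyramid needs roughly one eighth of the weights of the standard one, while the three branches cover 3 x 3, 5 x 5, and 7 x 7 neighbourhoods, which is the general reason such branches can respond to heads of different sizes at low cost.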
Cite this article
Liang, L., Zhao, H., Zhou, F. et al. PDDNet: lightweight congested crowd counting via pyramid depth-wise dilated convolution. Appl Intell 53, 10472–10484 (2023). https://doi.org/10.1007/s10489-022-03967-6
DOI: https://doi.org/10.1007/s10489-022-03967-6