Abstract
Crowd density estimation is of practical importance for preventing stampede accidents. However, crowd counting is easily disturbed by factors such as perspective, congestion, occlusion, and density, which makes accurate counting challenging. To address these problems, we propose a hierarchical aggregation module that fuses information from different scales within the network. Because crowd counting is also strongly affected by the surrounding environment, we further use an attention module to weight the spatial positions of the learned feature maps, which limits interference from background regions. Extensive experiments show that our model generalizes well and performs better than existing methods on several public datasets.
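The two ideas named in the abstract, fusing features from different scales and weighting spatial positions with an attention mask, can be illustrated with a minimal sketch. It is not the paper's architecture, which the abstract does not specify. The function names `fuse_scales` and `spatial_attention`, the nearest-neighbour upsampling, the additive fusion, and the hand-set `channel_weights` and `bias` are all assumptions made for illustration; in a real network these weights would be learned.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fuse_scales(fine, coarse):
    """Add a 2x nearest-neighbour upsampled coarse map to a fine map.

    Both inputs are C x H x W nested lists; `coarse` has half the height
    and width of `fine`. Additive fusion is an illustrative assumption.
    """
    return [
        [
            [value + coarse_ch[y // 2][x // 2] for x, value in enumerate(row)]
            for y, row in enumerate(fine_ch)
        ]
        for fine_ch, coarse_ch in zip(fine, coarse)
    ]


def spatial_attention(features, channel_weights, bias=0.0):
    """Scale every channel by one sigmoid score per spatial position.

    `channel_weights` plays the role of a 1x1 convolution that collapses
    the channels into a single score map (hypothetical, hand-set here).
    """
    height, width = len(features[0]), len(features[0][0])
    mask = [
        [
            sigmoid(bias + sum(w * ch[y][x] for w, ch in zip(channel_weights, features)))
            for x in range(width)
        ]
        for y in range(height)
    ]
    weighted = [
        [[ch[y][x] * mask[y][x] for x in range(width)] for y in range(height)]
        for ch in features
    ]
    return weighted, mask


# Toy input: one channel, a 2x2 fine map and a 1x1 coarse map.
fine = [[[1.0, 0.0], [0.0, 0.0]]]
coarse = [[[1.0]]]
fused = fuse_scales(fine, coarse)
weighted, mask = spatial_attention(fused, channel_weights=[4.0], bias=-6.0)
```

In this toy case the top-left position, which has the strongest fused response, receives a mask value above 0.5, while the remaining positions are pushed toward zero. This is the sense in which a spatial mask can suppress background regions.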
This work was supported in part by the National Natural Science Foundation of China under Grants 61571382, 81671766, 61571005, 81671674, 61971369, 61671309 and U1605252, in part by the Fundamental Research Funds for the Central Universities under Grants 20720160075 and 20720180059, in part by the CCF-Tencent open fund, and in part by the Natural Science Foundation of Fujian Province of China (No. 2017J01126).
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Guo, H., He, F., Cheng, X., Ding, X., Huang, Y. (2019). Pay Attention to Deep Feature Fusion in Crowd Density Estimation. In: Gedeon, T., Wong, K., Lee, M. (eds) Neural Information Processing. ICONIP 2019. Communications in Computer and Information Science, vol 1142. Springer, Cham. https://doi.org/10.1007/978-3-030-36808-1_39
DOI: https://doi.org/10.1007/978-3-030-36808-1_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36807-4
Online ISBN: 978-3-030-36808-1
eBook Packages: Computer Science, Computer Science (R0)