Crowd Counting Method via Scalable Modularized Convolutional Neural Network

Abstract: This paper aims to accurately estimate crowd density in real scenes from images taken at arbitrary perspectives and with arbitrary crowd densities. Crowd counting in static images is a challenging problem: because of the perspective distortion and crowd occlusion caused by projecting 3D space onto a 2D image, it is difficult to distinguish one individual from another, or an individual from the background. To this end, this paper proposes a flexible, efficient, and scalable modularized convolutional neural network (CNN) architecture. The network accepts input images of arbitrary size and resolution directly and requires no additional computation of perspective (view) information. Each module of the architecture uses a multi-column structure with different convolution kernel sizes, which can fit individuals at different distances from the camera. Each module also combines the feature information of the preceding and following layers, reducing the loss of accuracy caused by vanishing gradients. Experiments show that, compared with the state-of-the-art MCNN method, the accuracy of the proposed method is increased by 14.58% and 40.53%, and the root mean square error is reduced by 23.89% and 33.90%, on the ShanghaiTech PartA and PartB datasets respectively.
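The module described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single-channel feature map, uses fixed averaging kernels as placeholders for learned weights, picks the kernel sizes 3, 5, and 7 arbitrarily, and reads the fusion of "preceding and following layers" as averaging the column outputs with the module input. The names `conv2d_same`, `multi_column_module`, and `stack_modules` are hypothetical.

```python
def conv2d_same(fmap, kernel):
    """Zero-padded 'same' convolution of a 2D list with an odd-sized square kernel."""
    h, w = len(fmap), len(fmap[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - r, x + dx - r
                    if 0 <= yy < h and 0 <= xx < w:  # outside the map counts as zero
                        acc += fmap[yy][xx] * kernel[dy][dx]
            out[y][x] = acc
    return out


def relu(fmap):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in fmap]


def box_kernel(k):
    """Placeholder weights (assumption): a uniform k x k averaging kernel."""
    return [[1.0 / (k * k)] * k for _ in range(k)]


def multi_column_module(fmap, kernel_sizes=(3, 5, 7)):
    """One module: parallel columns with different kernel sizes, fused with the input.

    Small kernels respond to distant (small) heads, large kernels to near ones.
    The fusion here is a plain average of the columns and the module input,
    standing in for a learned merge layer (assumption).
    """
    columns = [relu(conv2d_same(fmap, box_kernel(k))) for k in kernel_sizes]
    branches = columns + [fmap]  # skip connection carries the earlier features forward
    h, w = len(fmap), len(fmap[0])
    return [[sum(b[y][x] for b in branches) / len(branches) for x in range(w)]
            for y in range(h)]


def stack_modules(fmap, num_modules):
    """Scalability: the same module is repeated; output size always equals input size."""
    for _ in range(num_modules):
        fmap = multi_column_module(fmap)
    return fmap
```

Because every operation is a size-preserving convolution or an element-wise step, the sketch works on inputs of any height and width, which mirrors the abstract's claim that images of arbitrary size can be fed in directly.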

Key words: Convolutional neural network, Crowd counting, Density maps, Feature fusion, Scalable module

CLC Number: TP391

LI Yun-bo, TANG Si-qi, ZHOU Xing-yu, PAN Zhi-song. Crowd Counting Method via Scalable Modularized Convolutional Neural Network [J]. Computer Science, 2018, 45(8): 17-21.

