Improving Crowd Density Estimation by Fusing Aerial Images and Radio Signals

Published: 04 March 2022

Abstract

A recent line of research focuses on crowd density estimation from RGB images for a variety of applications, such as surveillance and traffic flow control. However, performance drops dramatically on low-quality images, for example under occlusion or poor lighting conditions. Because people are equipped with various wireless devices, the received signals can be easily collected at the base station, and another line of research therefore uses received signals for crowd counting. Received signals, however, offer only information about the number of people; an accurate density map cannot be derived from them. As unmanned aerial vehicles (UAVs) are now treated as flying base stations and are equipped with cameras, we make the first attempt to leverage both RGB images and received signals for crowd density estimation on UAVs. Specifically, we propose a novel network that effectively fuses RGB images with received signal strength (RSS) information. Moreover, we design a new loss function that accounts for the uncertainty in RSS and makes the prediction consistent with the received signals. Experimental results show that the proposed method overcomes the limitations of traditional crowd density estimation methods and achieves state-of-the-art performance. The proposed dataset is released as a public download for future research.
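
The abstract does not give the form of the loss, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes that a density map sums to the number of people, that the RSS measurements have already been turned into a count estimate `rss_count` with a standard deviation `rss_sigma`, and that the two terms are combined with a hypothetical weight `weight`. All function and parameter names are invented for illustration.

```python
import numpy as np


def pixel_loss(pred, target):
    """Mean squared error between predicted and ground-truth density maps."""
    return float(np.mean((pred - target) ** 2))


def count_consistency_loss(pred, rss_count, rss_sigma):
    """Penalize disagreement between the map's total count and an RSS-derived count.

    Assumption: the squared gap is divided by the RSS variance, so a noisy
    RSS estimate (large rss_sigma) constrains the prediction less.
    """
    predicted_count = float(pred.sum())
    return (predicted_count - rss_count) ** 2 / (2.0 * rss_sigma ** 2)


def total_loss(pred, target, rss_count, rss_sigma, weight=0.1):
    """Pixel-wise term plus a weighted count-consistency term (hypothetical weighting)."""
    return pixel_loss(pred, target) + weight * count_consistency_loss(
        pred, rss_count, rss_sigma
    )


# Toy example: two people in a 4x4 scene, and a flat prediction that sums to 2.
target = np.zeros((4, 4))
target[1, 1] = 1.0
target[2, 3] = 1.0
pred = np.full((4, 4), 0.125)

agree = count_consistency_loss(pred, rss_count=2.0, rss_sigma=1.0)
confident_gap = count_consistency_loss(pred, rss_count=4.0, rss_sigma=1.0)
uncertain_gap = count_consistency_loss(pred, rss_count=4.0, rss_sigma=2.0)
```

In this toy setup the consistency term vanishes when the predicted count matches the RSS-derived count, and the same count gap is penalized less when the RSS estimate is less certain.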

    Information & Contributors

    Information

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 18, Issue 3
    August 2022
    478 pages
    ISSN:1551-6857
    EISSN:1551-6865
    DOI:10.1145/3505208

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 04 March 2022
    Accepted: 01 October 2021
    Revised: 01 September 2021
    Received: 01 February 2021
    Published in TOMM Volume 18, Issue 3

    Author Tags

    1. Crowd density estimation
    2. unmanned aerial vehicles
    3. data fusion
    4. datasets

    Qualifiers

    • Research-article
    • Refereed

    Funding Sources

    • Ministry of Science and Technology (MOST) of Taiwan

    Cited By

    • (2024) Enhancing trust transfer in supply chain finance: a blockchain-based transitive trust model. Journal of Cloud Computing: Advances, Systems and Applications 13:1. DOI: 10.1186/s13677-023-00557-w. Online publication date: 2-Jan-2024
    • (2024) DAG-YOLO: A Context-feature Adaptive Fusion Rotating Detection Network in Remote Sensing Images. ACM Transactions on Multimedia Computing, Communications, and Applications. DOI: 10.1145/3674978. Online publication date: 27-Jun-2024
    • (2024) Learning Offset Probability Distribution for Accurate Object Detection. ACM Transactions on Multimedia Computing, Communications, and Applications 20:5 (1-24). DOI: 10.1145/3637214. Online publication date: 22-Jan-2024
    • (2024) Efficient crowd density estimation with edge intelligence via structural reparameterization and knowledge transfer. Applied Soft Computing 154:C. DOI: 10.1016/j.asoc.2024.111366. Online publication date: 2-Jul-2024
    • (2024) Introduction. 6G Enabled Healthcare Systems (1-12). DOI: 10.1007/978-3-031-73849-4_1. Online publication date: 13-Nov-2024
    • (2023) Sparsity-guided Discriminative Feature Encoding for Robust Keypoint Detection. ACM Transactions on Multimedia Computing, Communications, and Applications 20:3 (1-22). DOI: 10.1145/3628432. Online publication date: 17-Oct-2023
    • (2023) Boosting Few-shot Object Detection with Discriminative Representation and Class Margin. ACM Transactions on Multimedia Computing, Communications, and Applications 20:3 (1-19). DOI: 10.1145/3608478. Online publication date: 10-Nov-2023
    • (2023) Pseudo Object Replay and Mining for Incremental Object Detection. Proceedings of the 31st ACM International Conference on Multimedia (153-162). DOI: 10.1145/3581783.3611952. Online publication date: 26-Oct-2023
    • (2023) Distilled Meta-learning for Multi-Class Incremental Learning. ACM Transactions on Multimedia Computing, Communications, and Applications 19:4 (1-16). DOI: 10.1145/3576045. Online publication date: 15-Mar-2023
    • (2023) When Object Detection Meets Knowledge Distillation: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 45:8 (10555-10579). DOI: 10.1109/TPAMI.2023.3257546. Online publication date: 1-Aug-2023
