DOI: 10.1145/3585967.3585990

Multi-Object Tracking based on RGB-D Sensors

Published: 19 April 2023

Abstract

Multi-object tracking (MOT) based on a single 2D camera, which provides no depth information, usually suffers from poor accuracy. In this paper, we propose an MOT method based on a sensor combination of a camera and an ultra-wideband (UWB) radar, which together function similarly to a depth (RGB-D) camera. First, we establish a backbone network to extract feature maps from the video frames captured by the camera. Then, we combine Faster R-CNN with a re-ID branch to detect objects, yielding each object's category, coordinates, and ID. To track objects, we construct a similarity matrix that encodes the data association between the detected objects and their historical trajectories. Each element of the matrix is the intersection over union (IoU) between an object and its two associated trajectory types, derived from the image data and the UWB localization data, respectively. Finally, the stored trajectories are updated from these two trajectory types, and the recognition network is updated via the localization loss. Experimental results show that our method achieves multi-object recognition and tracking, and outperforms previous methods by a large margin on several public datasets.
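
The abstract does not include the paper's implementation, but the association step it describes can be sketched compactly. The following Python snippet is a minimal illustration, not the authors' code: the fusion weight alpha, the match threshold min_iou, and all function names are assumptions introduced here. It builds a similarity matrix of IoU scores between current detections and the two trajectory types (image-based and UWB-based), then solves the assignment with the Hungarian algorithm, a common choice that the abstract does not explicitly confirm.

    # Minimal sketch of IoU-based data association between detections and
    # trajectories, as described in the abstract. alpha, min_iou, and all
    # names are illustrative assumptions, not the paper's implementation.
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def iou(a, b):
        """IoU of two boxes in (x1, y1, x2, y2) format."""
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union > 0 else 0.0

    def associate(detections, image_tracks, uwb_tracks,
                  alpha=0.5, min_iou=0.3):
        """Match detections to trajectories via a fused similarity matrix.

        Each element blends the IoU against a track's image-based
        trajectory and its UWB-localization trajectory.
        """
        sim = np.zeros((len(detections), len(image_tracks)))
        for i, det in enumerate(detections):
            for j, (img_box, uwb_box) in enumerate(
                    zip(image_tracks, uwb_tracks)):
                sim[i, j] = (alpha * iou(det, img_box)
                             + (1 - alpha) * iou(det, uwb_box))
        # The Hungarian algorithm minimizes cost, so negate the similarity.
        rows, cols = linear_sum_assignment(-sim)
        # Discard weak matches; unmatched detections would spawn new tracks.
        return [(i, j) for i, j in zip(rows, cols) if sim[i, j] >= min_iou]

Under this reading, the UWB-based trajectory acts as a depth-informed prior that keeps associations stable when image-only IoU is ambiguous, which matches the abstract's motivation for augmenting the 2D camera.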


Cited By

  • (2024) Deep learning and multi-modal fusion for real-time multi-object tracking: Algorithms, challenges, datasets, and comparative study. Information Fusion, Vol. 105, Article 102247, May 2024. DOI: 10.1016/j.inffus.2024.102247


Published In

icWCSN '23: Proceedings of the 2023 10th International Conference on Wireless Communication and Sensor Networks
January 2023
162 pages
ISBN: 9781450398466
DOI: 10.1145/3585967
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Faster R-CNN network
  2. RGB-D
  3. re-ID network
  4. multi-object tracking

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Science and Technology Projects of State Grid Corporation of China

Conference

icWCSN 2023


