Abstract
Object detection in drone imagery is an active topic in computer vision, with wide application in traffic analysis and control, rescue systems, smart agriculture, and other areas. However, developing and optimizing such applications remains challenging because of high object density, multi-scale objects, and motion blur. To partly address these problems, this research improves the performance of the YOLOv5m network by exploiting the advantages of the Bi-directional Feature Pyramid Network (BiFPN), Transformer, and Convolutional Block Attention Module (CBAM). In experiments, the network achieves 68.6% and 42.6% mAP on the proposed dataset (ISLab-Drone) and VisDrone 2021, respectively. These results show that the network outperforms the other networks compared under the same testing conditions.
Acknowledgement
This result was supported by the “Regional Innovation Strategy (RIS)” through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (MOE) (2021RIS-003).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Nguyen, DL., Vo, XT., Priadana, A., Jo, KH. (2024). Minor Object Recognition from Drone Image Sequence. In: Irie, G., Shin, C., Shibata, T., Nakamura, K. (eds) Frontiers of Computer Vision. IW-FCV 2024. Communications in Computer and Information Science, vol 2143. Springer, Singapore. https://doi.org/10.1007/978-981-97-4249-3_12
DOI: https://doi.org/10.1007/978-981-97-4249-3_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-4248-6
Online ISBN: 978-981-97-4249-3
eBook Packages: Computer Science, Computer Science (R0)