Improved UAV Opium Poppy Detection Using an Updated YOLOv3 Model
Figure 1. Characteristics of poppies at different growing periods and flying heights.
Figure 2. Preliminary processing for poppy selection: (a) original images; (b) verified images; (c) labeled images; (d) label information.
Figure 3. Probability density function of the Beta distribution for different values of the parameters alpha and beta.
Figure 4. Fused image resulting from synthesizing the background and poppy images, for poppy detection based on the Beta distribution.
Figure 5. Various backbone networks: (a) the shortcut connection structure in ResNet; (b) the dense connection in DenseNet; (c) the inverted residual structure in MobileNetv2; (d) the channel shuffle structure in ShuffleNetv2.
Figure 6. Original spatial pyramid pooling (SPP) structure in SPP-net.
Figure 7. Improved SPP unit.
Figure 8. Structure of the SPP-YOLOv3 model.
Figure 9. (a) Training and (b) validation losses of YOLOv3 based on various backbone networks.
Figure 10. Precision × Recall (PR) curves of YOLOv3 based on various backbone networks.
Figure 11. F2 score vs. recall curves of YOLOv3 based on various backbone networks.
Figure 12. (a) Training and (b) validation losses of the enhanced YOLOv3 model.
Figure 13. PR curve of the enhanced YOLOv3 model.
Figure 14. Partial detection results of the SPP-GIoU-YOLOv3-MN model on the testing dataset.
Figure 15. Partial test results for complete unmanned aerial vehicle (UAV) images: (a,b) true detections; (c) a false detection; (d) a missed detection.
Figure 16. Structure of the SPP3-YOLOv3-MN model.
Figure 17. (a) Training and (b) validation losses for the SPP3-YOLOv3-MN and SPP-YOLOv3-MN models.
Figure 18. PR curves of the SPP3-YOLOv3-MN and SPP-YOLOv3-MN models.
Figure 19. Complexity of the background in UAV images, including buildings, other crops, and shrubs.
Abstract
1. Introduction
2. Study Area and Data
2.1. Data Acquisition
2.2. Data Processing
2.2.1. Preliminary Processing
Algorithm 1: Data Cropping Strategy
Input: One dataset, A, consisting of N large UAV images.
Output: One dataset, B, consisting of cropped images with a fixed size (416 × 416 pixels).
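The paper does not publish code for Algorithm 1, so the following is a minimal Python sketch of one plausible implementation: a sliding 416 × 416 window whose last row and column are shifted inward so every crop stays inside the image. All names are illustrative, and the handling of label boxes is omitted.

```python
# Illustrative sketch of Algorithm 1: tile large UAV images into
# fixed-size 416 x 416 crops. The tiling scheme and names are
# assumptions; the paper's implementation is not published.
import os
from PIL import Image

TILE = 416  # network input size used in the paper

def crop_dataset(src_dir, dst_dir, tile=TILE):
    os.makedirs(dst_dir, exist_ok=True)
    for name in sorted(os.listdir(src_dir)):
        if not name.lower().endswith((".jpg", ".jpeg", ".png", ".tif")):
            continue
        img = Image.open(os.path.join(src_dir, name))
        w, h = img.size
        base, _ = os.path.splitext(name)
        # Regular grid of window origins; if the image size is not a
        # multiple of the tile, shift the last row/column inward.
        xs = list(range(0, max(w - tile, 0) + 1, tile))
        ys = list(range(0, max(h - tile, 0) + 1, tile))
        if w > tile and xs[-1] != w - tile:
            xs.append(w - tile)
        if h > tile and ys[-1] != h - tile:
            ys.append(h - tile)
        for y in ys:
            for x in xs:
                crop = img.crop((x, y, x + tile, y + tile))
                crop.save(os.path.join(dst_dir, f"{base}_{x}_{y}.png"))
```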
2.2.2. Data Fusion
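The Figure 3 and Figure 4 captions indicate that background and poppy images are fused with mixing weights drawn from a Beta distribution, in the spirit of mixup (Zhang et al., 2018). Below is a minimal sketch of that style of fusion, assuming pixel-wise convex blending; the Beta parameters are illustrative and the paper's exact fusion procedure may differ.

```python
# Mixup-style image fusion: blend a poppy crop with a background
# crop using a weight drawn from a Beta distribution. The alpha and
# beta values here are illustrative, not the paper's.
import numpy as np

def beta_fuse(poppy, background, a=1.5, b=1.5, rng=np.random):
    """poppy, background: HxWx3 float arrays in [0, 1], same shape."""
    lam = rng.beta(a, b)                    # mixing weight in (0, 1)
    fused = lam * poppy + (1.0 - lam) * background
    return fused, lam                       # lam can also weight the labels
```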
2.2.3. Data Augmentation
Algorithm 2: Data Augmentation Strategy
Input: The original dataset, A, with N images, and a set of random transforms, T (random cropping, random flipping, random rotation, random resizing, random changes in brightness, random sharpening, and random noise addition).
Output: The enhanced dataset, B.
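A compact Python sketch of Algorithm 2 follows, covering most of the listed transforms. The probabilities and parameter ranges are assumptions; random cropping and resizing are omitted for brevity, and bounding boxes would need matching geometric transforms.

```python
# Illustrative sketch of Algorithm 2: apply a random subset of the
# listed transforms to each image. Ranges and probabilities are
# assumptions, not values from the paper.
import random
import numpy as np
from PIL import Image, ImageEnhance, ImageFilter

def augment(img: Image.Image) -> Image.Image:
    if random.random() < 0.5:   # random flipping
        img = img.transpose(Image.FLIP_LEFT_RIGHT)
    if random.random() < 0.5:   # random rotation by a right angle
        img = img.transpose(random.choice(
            [Image.ROTATE_90, Image.ROTATE_180, Image.ROTATE_270]))
    if random.random() < 0.5:   # random change in brightness
        img = ImageEnhance.Brightness(img).enhance(random.uniform(0.7, 1.3))
    if random.random() < 0.3:   # random sharpening
        img = img.filter(ImageFilter.SHARPEN)
    if random.random() < 0.3:   # random Gaussian noise addition
        arr = np.asarray(img).astype(np.float32)
        arr += np.random.normal(0.0, 8.0, arr.shape)
        img = Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
    return img
```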
3. Methodology
3.1. YOLOv3 Model Based on Multiple Backbone Networks
3.1.1. Backbone Networks
3.1.2. Model Training
3.2. Improved YOLOv3 Model
3.2.1. Improved Spatial Pyramid Pooling Unit
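For context, the SPP unit that this section improves upon is, in its common YOLOv3-SPP form, a set of parallel stride-1 max-pooling branches whose outputs are concatenated with the input. A hedged tf.keras sketch of that standard pattern follows; the 5/9/13 kernel sizes are the usual defaults and are not necessarily those of the paper's improved unit.

```python
# A common SPP block: parallel stride-1 max pools at several kernel
# sizes, concatenated with the input (as in YOLOv3-SPP). Kernel
# sizes are the customary 5/9/13; the paper's unit may differ.
from tensorflow.keras import layers

def spp_block(x, pool_sizes=(5, 9, 13)):
    pools = [
        layers.MaxPooling2D(pool_size=k, strides=1, padding="same")(x)
        for k in pool_sizes
    ]
    return layers.Concatenate()([x] + pools)
```

Because all branches use stride 1 with "same" padding, the spatial size is preserved and only the channel count grows, which keeps the block easy to splice into the detection head.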
3.2.2. Network Hyperparameter Setting and Model Training
3.3. Trained Model Prediction
3.3.1. Single UAV Image Prediction
3.3.2. Multiple UAV Image Prediction
4. Model Evaluation Metrics
4.1. Intersection over Union
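Both IoU and its generalization GIoU (used in the SPP-GIoU-YOLOv3-MN loss) have standard closed forms; a small Python sketch for axis-aligned boxes, following the definition of Rezatofighi et al. (2019):

```python
# IoU and GIoU for axis-aligned boxes given as (x1, y1, x2, y2).
# GIoU = IoU minus the fraction of the smallest enclosing box C
# that is not covered by the union of the two boxes.
def iou_giou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0
    # Smallest enclosing box C
    cx1, cy1 = min(a[0], b[0]), min(a[1], b[1])
    cx2, cy2 = max(a[2], b[2]), max(a[3], b[3])
    area_c = (cx2 - cx1) * (cy2 - cy1)
    giou = iou - (area_c - union) / area_c
    return iou, giou
```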
4.2. Precision × Recall Curve and Average Precision
4.3. Mean Average Precision
4.4. F-Score
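The F2 score reported in the results tables is the β = 2 case of the general F-measure, which weights recall more heavily than precision:

$$
F_\beta = (1 + \beta^2)\,\frac{P\,R}{\beta^2 P + R},
\qquad
F_2 = \frac{5\,P\,R}{4P + R}
$$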
5. Results
5.1. Backbone Network Assessment
5.1.1. Training
5.1.2. Testing
5.2. YOLOv3-MobileNetv2 Assessment
5.3. SPP-GIoU-YOLOv3-MN Model Performance with Complete UAV Images
6. Discussion
6.1. Testing One vs. Three SPP Units
6.2. Limitations of the Current Training Dataset
6.2.1. Poppy Complexity
6.2.2. Background Complexity
6.3. Advantages and Applicability of the Proposed Method
6.4. Model Acceleration and Future Work
7. Conclusions
Author Contributions
Funding
Conflicts of Interest
Abbreviations
UAV: Unmanned Aerial Vehicle
CNN: Convolutional Neural Network
YOLO: You Only Look Once
IoU: Intersection over Union
SPP: Spatial Pyramid Pooling
GIoU: Generalized Intersection over Union
| Dataset | Growth Period | Flying Height 30 m | 60 m | 150 m |
|---|---|---|---|---|
| Before balancing | Seedling | 216 | 143 | 42 |
| Before balancing | Flowering | 279 | 252 | 108 |
| After balancing | Seedling | 216 | 234 | 126 |
| After balancing | Flowering | 279 | 252 | 128 |
| | Training | Validation | Testing |
|---|---|---|---|
| Number of Images | 2975 | 425 | 850 |
| Item | Value |
|---|---|
| Optimization Method | Adam |
| Initial Learning Rate | 0.001 |
| Learning Rate Schedule | If the validation loss does not decline for 20 epochs, the learning rate is multiplied by 0.1 |
| Batch Size | 10 for most backbones; 6 for ResNet and DenseNet |
| Training Epochs | 500 |
| Early Stopping | Triggered if the validation loss does not decline for 50 epochs |
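Translated into tf.keras, the schedule above corresponds to standard callbacks. A hedged sketch follows; the factor and patience values mirror the table, while the model, loss, and data objects are assumed placeholders.

```python
# Illustrative tf.keras translation of the training settings above:
# Adam at 1e-3, multiply the LR by 0.1 after 20 stagnant validation
# epochs, stop after 50. model/train_data/val_data are assumed.
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping

optimizer = Adam(learning_rate=1e-3)
callbacks = [
    ReduceLROnPlateau(monitor="val_loss", factor=0.1, patience=20),
    EarlyStopping(monitor="val_loss", patience=50),
]
# model.compile(optimizer=optimizer, loss=yolo_loss)
# model.fit(train_data, validation_data=val_data,
#           epochs=500, batch_size=10, callbacks=callbacks)
```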
| Item | Value |
|---|---|
| Optimization Method | Adam |
| Initial Learning Rate | 0.001 |
| Learning Rate Schedule | If the validation loss does not decline for 20 epochs, the learning rate is multiplied by 0.1 |
| Batch Size | 8 |
| Training Epochs | 500 |
| Early Stopping | Triggered if the validation loss does not decline for 50 epochs |
| Backbone Network | DarkNet53 | DenseNet121 | ResNet50 | MobileNetv2 | ShuffleNetv2 |
|---|---|---|---|---|---|
| Convergence Epochs | 257 | 251 | 374 | 346 | 276 |
| Backbone Network | AP 1 (%) | Params (MB) | Testing Time 2 (s) | Speed (FPS) | F2 Score (max) |
|---|---|---|---|---|---|
| DarkNet53 | 93.00 | 241.1 | 32.7 | 26.0 | 0.927 |
| DenseNet121 | 95.14 | 110.4 | 35.0 | 24.3 | 0.953 |
| ResNet50 | 95.60 | 419.6 | 38.9 | 21.9 | 0.956 |
| MobileNetv2 | 94.75 | 136.3 | 29.1 | 29.2 | 0.942 |
| ShuffleNetv2 | 91.09 | 80.1 | 25.5 | 33.3 | 0.913 |
| Improvements | YOLOv3-MobileNetv2 | SPP-YOLOv3-MN | SPP-GIoU-YOLOv3-MN |
|---|---|---|---|
| SPP unit? | | √ | √ |
| GIoU? | | | √ |
| AP (%) | 94.75 | 95.67 | 96.37 |
| Params (MB) | 136.3 | 165.9 | 165.9 |
| Testing time (s) | 29.1 | 29.3 | 29.3 |
| Speed (FPS) | 29.2 | 29.0 | 29.0 |
| F2 score (max) | 0.942 | 0.955 | 0.960 |
| Evaluation Index | SPP-GIoU-YOLOv3-MN | YOLOv3-ResNet |
|---|---|---|
| AP (%) | 96.37 | 95.60 |
| Params (MB) | 165.9 | 419.6 |
| Testing time (s) | 29.3 | 38.9 |
| Speed (FPS) | 29.0 | 21.9 |
| F2 score (max) | 0.960 | 0.956 |
| Model | AP (%) | Params (MB) | Testing Time (s) | Speed (FPS) | F2 Score (max) |
|---|---|---|---|---|---|
| SPP-YOLOv3-MN | 95.67 | 165.9 | 29.3 | 29.0 | 0.960 |
| SPP3-YOLOv3-MN | 95.51 | 175.1 | 30.2 | 28.1 | 0.954 |
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).