YOLOv8-GDCI: Research on the Phytophthora Blight Detection Method of Different Parts of Chili Based on Improved YOLOv8 Model
Figure 1. Image data enhancement. (a) Original image; (b) random flipping; (c) Gaussian noise; (d) random cropping; (e) brightness change.
Figure 2. The distribution of dataset labels. (a) The amount of training data and the number of instances per category. (b) The size and number of bounding boxes. (c) The position of the box center relative to the entire image. (d) The aspect ratio of targets relative to the entire image.
Figure 3. The structure of YOLOv8.
Figure 4. The structures of PAN (a) and GFPN (b).
Figure 5. Multi-scale feature fusion network structure and module design. (a) RepGFPN removes the upsample connection and uses the CSPStage module for feature fusion. (b) The CSPStage module performs the feature fusion operations; ×N indicates N copies of the structure shown in the dashed box. (c) The 3 × 3 Rep block is a model reparameterization technique that reduces computational requirements and enhances model efficiency.
Figure 6. The design of DySample's modules and its network architecture. (a) X denotes the input feature, X′ the upsampled feature, and S the sampling set. The sampling point generator produces a sampling set, which the grid sampling function uses to resample the input feature. (b) X1, X2, and X3 denote offsets of size 2gs² × H × W. O is the generated offset, G the original grid, and σ the sigmoid function.
Figure 7. The structures of SE (a) and CBAM (b).
Figure 8. The structure of CA.
Figure 9. The network structure diagram of the YOLOv8-GDCI algorithm.
Figure 10. Visual comparison using different feature networks. (a) Ground truth; (b) PAN; (c) GFPN; (d) BiFPN; (e) RepGFPN.
Figure 11. Contrast experiments of different loss functions in terms of loss values.
Figure 12. The visualization results in different environments.
Abstract
1. Introduction
- In this paper, a dataset with three labels corresponding to the diseased parts, namely leaf-1, fruit-1, and stem-1, was constructed, providing data support for the detection of chili Phytophthora blight.
- Replacing the original neck network with RepGFPN integrates feature maps from different levels more effectively without significantly increasing the computational cost, addressing the problem of complex backgrounds.
- DySample replaces the original upsampling algorithm; it aggregates context information more effectively while introducing little computational overhead.
- To enhance the detection of concealed and compact targets, the CA attention mechanism, which combines channel information with spatial location information, is integrated into the final stage of the backbone network.
- The Inner-MPDIoU loss replaces CIoU, minimizing the distances between the corner points of the predicted and ground-truth boxes. Furthermore, an auxiliary bounding box enhances the learning capacity for chili disease samples.
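The corner-distance idea behind Inner-MPDIoU can be sketched in plain Python. This is an illustrative reconstruction, not the paper's implementation: the function names and the auxiliary-box `ratio` value are assumptions, while the normalization by the squared image diagonal follows the MPDIoU formulation.

```python
def iou(b1, b2):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(b1[0], b2[0]), max(b1[1], b2[1])
    ix2, iy2 = min(b1[2], b2[2]), min(b1[3], b2[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (a1 + a2 - inter + 1e-9)

def scale_box(b, ratio):
    """Auxiliary 'inner' box: same centre, side lengths scaled by `ratio`."""
    cx, cy = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    w, h = (b[2] - b[0]) * ratio, (b[3] - b[1]) * ratio
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def inner_mpd_iou(pred, gt, img_w, img_h, ratio=0.75):
    # Squared distances between matching corners of prediction and ground truth
    d_tl = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    d_br = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2
    norm = img_w ** 2 + img_h ** 2  # squared image diagonal
    # MPDIoU penalty applied to the IoU of the scaled auxiliary boxes
    inner = iou(scale_box(pred, ratio), scale_box(gt, ratio))
    return inner - d_tl / norm - d_br / norm
```

With `ratio < 1` the auxiliary boxes shrink toward their centres, which sharpens the regression signal for high-IoU samples; `ratio > 1` enlarges them, which helps low-IoU samples.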
2. Materials and Methods
2.1. Data Collection
2.2. Methods
2.2.1. YOLOv8 Model
2.2.2. Multi-Scale Feature Fusion Network RepGFPN
2.2.3. Lightweight DySample Upsampling
2.2.4. CoordAtt Module
2.2.5. Inner-MPDIoU
2.2.6. YOLOv8-GDCI Model
3. Results
3.1. Experimental Environment and Model Evaluation Metrics
3.2. Ablation Experiment
3.3. Experimental Comparison of Different Feature Aggregation Networks
3.4. Comparative Experiments of Different Upsampling Algorithms
3.5. The Contrast Experiment of the Attention Mechanism
- CA added only at the end of the backbone network
- CA added only at the end of the neck network
- CA added at the end of both the backbone network and the neck network
3.6. Contrast Experiments of Different Loss Functions
3.7. Comparative Experiments of Different Models
3.8. Visualization
4. Discussion
- Addressing the limited multi-scale feature fusion in the neck network of the original YOLOv8n, this paper adopts the RepGFPN architecture. While keeping the computational overhead under control, it promotes the interaction and integration of features across scales, markedly improving the model's ability to detect multi-scale objects in complex settings.
- The DySample operator replaces the original upsampling operator. By constructing the upsampling process through a point-sampling strategy, it effectively enlarges the model's feature receptive field and achieves finer, more accurate feature reconstruction.
- We integrated the CA attention mechanism into the final stage of the backbone network; by combining channel information with spatial position information, it enables more comprehensive and accurate target detection.
- To address the inadequate performance of the CIoU loss function on small-object detection, this paper adopts the Inner-MPDIoU loss function instead. It improves the accuracy of small-target detection and, to a certain extent, the overall detection efficiency and performance.
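The coordinate attention (CA) mechanism described above can be sketched in NumPy. This is a simplified illustration with random (untrained) 1×1 convolution weights and without batch normalization; the function name and the reduction ratio `r` are assumptions, not the paper's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coord_att(x, r=8, seed=0):
    """Coordinate attention sketch for a single feature map x of shape (C, H, W)."""
    rng = np.random.default_rng(seed)
    C, H, W = x.shape
    Cm = max(C // r, 1)  # reduced channel count
    # Directional pooling: keep one spatial axis at a time
    zh = x.mean(axis=2)                    # (C, H): pooled along width
    zw = x.mean(axis=1)                    # (C, W): pooled along height
    z = np.concatenate([zh, zw], axis=1)   # (C, H + W)
    # Shared 1x1 conv over channels = matrix multiply (random weights here)
    w1 = rng.standard_normal((Cm, C)) * 0.1
    f = np.maximum(w1 @ z, 0.0)            # (Cm, H + W), ReLU
    fh, fw = f[:, :H], f[:, H:]
    # Two separate 1x1 convs expand back to C channels, then sigmoid gates
    w_h = rng.standard_normal((C, Cm)) * 0.1
    w_w = rng.standard_normal((C, Cm)) * 0.1
    a_h = sigmoid(w_h @ fh)                # (C, H): attention along height
    a_w = sigmoid(w_w @ fw)                # (C, W): attention along width
    return x * a_h[:, :, None] * a_w[:, None, :]
```

Because pooling keeps one spatial axis at a time, the gates `a_h` and `a_w` retain positional information along height and width respectively, which is what distinguishes CA from purely channel-wise SE attention.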
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Zou, Z.; Zou, X. Geographical and ecological differences in pepper cultivation and consumption in China. Front. Nutr. 2021, 8, 718517. [Google Scholar] [CrossRef] [PubMed]
- Idoje, G.; Dagiuklas, T.; Iqbal, M. Survey for smart farming technologies: Challenges and issues. Comput. Electr. Eng. 2021, 92, 107104. [Google Scholar] [CrossRef]
- Ozyilmaz, U. Evaluation of the effectiveness of antagonistic bacteria against Phytophthora blight disease in pepper with artificial intelligence. Biol. Control 2020, 151, 104379. [Google Scholar] [CrossRef]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
- Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
- Jocher, G.; Chaurasia, A.; Stoken, A.; Borovec, J.; Kwon, Y.; Michael, K.; Fang, J.; Wong, C.; Yifu, Z.; Montes, D.; et al. Ultralytics/YOLOv5: v6.2: YOLOv5 classification models, Apple M1, reproducibility, ClearML and Deci.ai integrations. Zenodo 2022. Available online: https://zenodo.org/records/7002879 (accessed on 12 July 2023).
- Li, C.; Li, L.; Jiang, H.; Weng, K.; Geng, Y.; Li, L.; Ke, Z.; Li, Q.; Cheng, M.; Nie, W.; et al. YOLOv6: A single-stage object detection framework for industrial applications. arXiv 2022, arXiv:2209.02976. [Google Scholar]
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 7464–7475. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef]
- Zhao, J.; Qu, J. Healthy and Diseased Tomatoes Detection Based on YOLOv2. In Proceedings of the Human Centered Computing; Tang, Y., Zu, Q., Rodríguez García, J.G., Eds.; Springer: Cham, Switzerland, 2019; pp. 347–353. [Google Scholar]
- Liu, J.; Wang, X. Early recognition of tomato gray leaf spot disease based on MobileNetv2-YOLOv3 model. Plant Methods 2020, 16, 83. [Google Scholar] [CrossRef] [PubMed]
- Sangaiah, A.K.; Yu, F.N.; Lin, Y.B.; Shen, W.C.; Sharma, A. UAV T-YOLO-Rice: An Enhanced Tiny Yolo Networks for Rice Leaves Diseases Detection in Paddy Agronomy. IEEE Trans. Netw. Sci. Eng. 2024, 1–16. [Google Scholar] [CrossRef]
- Xie, Z.; Li, C.; Yang, Z.; Zhang, Z.; Jiang, J.; Guo, H. YOLOv5s-BiPCNeXt, a Lightweight Model for Detecting Disease in Eggplant Leaves. Plants 2024, 13, 2303. [Google Scholar] [CrossRef]
- Yue, X.; Li, H.; Song, Q.; Zeng, F.; Zheng, J.; Ding, Z.; Kang, G.; Cai, Y.; Lin, Y.; Xu, X.; et al. YOLOv7-GCA: A Lightweight and High-Performance Model for Pepper Disease Detection. Agronomy 2024, 14, 618. [Google Scholar] [CrossRef]
- Yang, S.; Yao, J.; Teng, G. Corn Leaf Spot Disease Recognition Based on Improved YOLOv8. Agriculture 2024, 14, 666. [Google Scholar] [CrossRef]
- Wang, G.; Chen, Y.; An, P.; Hong, H.; Hu, J.; Huang, T. UAV-YOLOv8: A Small-Object-Detection Model Based on Improved YOLOv8 for UAV Aerial Photography Scenarios. Sensors 2023, 23, 7190. [Google Scholar] [CrossRef] [PubMed]
- Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path aggregation network for instance segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8759–8768. [Google Scholar]
- Jiang, Y.; Tan, Z.; Wang, J.; Sun, X.; Lin, M.; Li, H. GiraffeDet: A heavy-neck paradigm for object detection. In Proceedings of the International Conference on Learning Representations, Vienna, Austria, 4 May 2021. [Google Scholar]
- Xu, X.; Jiang, Y.; Chen, W.; Huang, Y.; Zhang, Y.; Sun, X. DAMO-YOLO: A report on real-time object detection design. arXiv 2022, arXiv:2211.15444. [Google Scholar]
- Soudy, M.; Afify, Y.; Badr, N. RepConv: A novel architecture for image scene classification on Intel scenes dataset. Int. J. Intell. Comput. Inf. Sci. 2022, 22, 63–73. [Google Scholar] [CrossRef]
- Cunningham, P.; Delany, S.J. K-nearest neighbour classifiers-a tutorial. ACM Comput. Surv. (CSUR) 2021, 54, 1–25. [Google Scholar] [CrossRef]
- Wang, J.; Chen, K.; Xu, R.; Liu, Z.; Loy, C.C.; Lin, D. CARAFE: Content-aware reassembly of features. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3007–3016. [Google Scholar]
- Lu, H.; Liu, W.; Fu, H.; Cao, Z. FADE: Fusing the assets of decoder and encoder for task-agnostic upsampling. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 231–247. [Google Scholar]
- Lu, H.; Liu, W.; Fu, H.; Cao, Z. FADE: A Task-Agnostic Upsampling Operator for Encoder–Decoder Architectures. Int. J. Comput. Vis. 2024, 1–22. [Google Scholar] [CrossRef]
- Liu, W.; Lu, H.; Fu, H.; Cao, Z. Learning to upsample by learning to sample. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 6027–6037. [Google Scholar]
- Niu, Z.; Zhong, G.; Yu, H. A review on the attention mechanism of deep learning. Neurocomputing 2021, 452, 48–62. [Google Scholar] [CrossRef]
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141. [Google Scholar]
- Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018. [Google Scholar]
- Hou, Q.; Zhou, D.; Feng, J. Coordinate attention for efficient mobile network design. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 13713–13722. [Google Scholar]
- Xiong, C.; Zayed, T.; Jiang, X.; Alfalah, G.; Abelkader, E.M. A Novel Model for Instance Segmentation and Quantification of Bridge Surface Cracks—The YOLOv8-AFPN-MPD-IoU. Sensors 2024, 24, 4288. [Google Scholar] [CrossRef]
- Ma, S.; Xu, Y. MPDIoU: A loss for efficient and accurate bounding box regression. arXiv 2023, arXiv:2307.07662. [Google Scholar]
- Zhang, H.; Xu, C.; Zhang, S. Inner-IoU: More effective intersection over union loss with auxiliary bounding box. arXiv 2023, arXiv:2311.02877. [Google Scholar]
- Chen, J.; Mai, H.; Luo, L.; Chen, X.; Wu, K. Effective feature fusion network in BIFPN for small object detection. In Proceedings of the 2021 IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 19–22 September 2021; pp. 699–703. [Google Scholar]
- Ouyang, D.; He, S.; Zhang, G.; Luo, M.; Guo, H.; Zhan, J.; Huang, Z. Efficient multi-scale attention module with cross-spatial learning. In Proceedings of the ICASSP 2023—2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; pp. 1–5. [Google Scholar]
- Yang, L.; Zhang, R.Y.; Li, L.; Xie, X. SimAM: A simple, parameter-free attention module for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, Online, 18–24 July 2021; pp. 11863–11874. [Google Scholar]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings, Part I 14; Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37. [Google Scholar]
Disease Site | Label | Label Number | Disease Characteristics
---|---|---|---
Leaf | leaf-1 | 1684 | Round or irregularly shaped water-soaked spots appear at the infection sites; the leaf color darkens, and the spots then expand, dry out, and turn brown.
Fruit | fruit-1 | 1275 | The disease initiates at either the base or the tip and spreads as irregular water-soaked patches, causing the fruit to turn brown and the pulp to become soft and rotten.
Stem | stem-1 | 1131 | Initially, dark green, moist, and irregularly shaped patches emerge, which subsequently turn into dark brown to black lesions.
Parameter | Value |
---|---|
Epochs | 400 |
Workers | 6 |
Optimizer | SGD |
Lr0 | 0.01 |
Lr1 | 0.01 |
Momentum | 0.937 |
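The settings in the table above map naturally onto keyword arguments of the Ultralytics training API; the sketch below only assembles them into a dict. The argument names are assumed to match the installed Ultralytics version, the dataset YAML name is hypothetical, and the table's Lr1 row is omitted because its mapping to an API argument is ambiguous.

```python
# Training hyperparameters taken from the settings table.
train_args = {
    "epochs": 400,      # training epochs
    "workers": 6,       # dataloader workers
    "optimizer": "SGD", # optimizer choice
    "lr0": 0.01,        # initial learning rate
    "momentum": 0.937,  # SGD momentum
}

# Hypothetical usage (requires the `ultralytics` package and a dataset YAML):
# from ultralytics import YOLO
# model = YOLO("yolov8n.yaml")
# model.train(data="chili_phytophthora.yaml", **train_args)
```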
RepGFPN | DySample | CA | Inner-MPDIoU | P | R | mAP0.5 | Params (M) | FPS |
---|---|---|---|---|---|---|---|---|
- | - | - | - | 0.905 | 0.824 | 0.872 | 3.15 | 232.8 |
✓ | - | - | - | 0.913 | 0.832 | 0.881 | 3.25 | 209.3 |
✓ | ✓ | - | - | 0.916 | 0.830 | 0.883 | 3.26 | 133 |
✓ | ✓ | ✓ | - | 0.910 | 0.844 | 0.886 | 3.27 | 156 |
✓ | ✓ | ✓ | ✓ | 0.914 | 0.842 | 0.889 | 3.27 | 174 |
RepGFPN | DySample | CA | Inner-MPDIoU | Leaf-1 | Fruit-1 | Stem-1 |
---|---|---|---|---|---|---|
- | - | - | - | 0.953 | 0.817 | 0.847 |
✓ | - | - | - | 0.953 | 0.833 | 0.857 |
✓ | ✓ | - | - | 0.959 | 0.843 | 0.846 |
✓ | ✓ | ✓ | - | 0.960 | 0.820 | 0.868 |
✓ | ✓ | ✓ | ✓ | 0.960 | 0.835 | 0.871 |
Model | P | R | mAP0.5 | Params (M) | FPS |
---|---|---|---|---|---|
PAN | 0.905 | 0.824 | 0.872 | 3.15 | 232 |
GFPN | 0.897 | 0.841 | 0.877 | 3.36 | 188 |
BiFPN | 0.901 | 0.823 | 0.865 | 1.99 | 211.5 |
RepGFPN | 0.913 | 0.832 | 0.881 | 3.25 | 209.8 |
Model | P | R | mAP0.5 | Params (M) | FPS |
---|---|---|---|---|---|
Nearest | 0.905 | 0.824 | 0.872 | 3.25 | 232 |
Carafe | 0.907 | 0.828 | 0.879 | 3.56 | 133 |
DySample | 0.916 | 0.830 | 0.883 | 3.26 | 194.9 |
Model | P | R | mAP0.5 | Params (M) |
---|---|---|---|---|
Base | 0.916 | 0.830 | 0.883 | 3.26 |
Base-a | 0.910 | 0.844 | 0.886 | 3.27 |
Base-b | 0.901 | 0.818 | 0.877 | 3.27 |
Base-c | 0.913 | 0.832 | 0.881 | 3.27 |
Model | P | R | mAP0.5 | Params (M) |
---|---|---|---|---|
None | 0.916 | 0.830 | 0.883 | 3.26 |
SE | 0.906 | 0.836 | 0.884 | 3.27 |
CBAM | 0.912 | 0.838 | 0.879 | 3.33 |
EMA | 0.926 | 0.822 | 0.871 | 3.29 |
SimAM | 0.911 | 0.843 | 0.881 | 3.26 |
CA | 0.910 | 0.844 | 0.886 | 3.27 |
Model | P | R | mAP0.5 |
---|---|---|---|
CIoU | 0.910 | 0.844 | 0.886 |
SIoU | 0.903 | 0.853 | 0.878 |
EIoU | 0.928 | 0.823 | 0.882 |
MPDIoU | 0.917 | 0.839 | 0.886 |
Inner-MPDIoU | 0.914 | 0.842 | 0.889 |
Model | P | R | mAP0.5 | Params (M) | FPS |
---|---|---|---|---|---|
Faster R-CNN | 0.931 | 0.823 | 0.879 | 72.0 | 23 |
SSD | 0.912 | 0.838 | 0.881 | 41.1 | 52.14 |
YOLOv5n | 0.906 | 0.812 | 0.862 | 1.76 | 254.6 |
YOLOv6n | 0.898 | 0.798 | 0.853 | 4.23 | 277 |
YOLOv8n | 0.905 | 0.824 | 0.872 | 3.15 | 232.8 |
YOLOv8s | 0.921 | 0.847 | 0.890 | 11.2 | 160 |
YOLOv8m | 0.950 | 0.826 | 0.892 | 25.9 | 81 |
YOLOv8-GDCI | 0.914 | 0.842 | 0.889 | 3.27 | 219.5 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Duan, Y.; Han, W.; Guo, P.; Wei, X. YOLOv8-GDCI: Research on the Phytophthora Blight Detection Method of Different Parts of Chili Based on Improved YOLOv8 Model. Agronomy 2024, 14, 2734. https://doi.org/10.3390/agronomy14112734