CN118397483A - Fruit tree target positioning and navigation line region segmentation method based on YOLO network - Google Patents
Fruit tree target positioning and navigation line region segmentation method based on YOLO network
- Publication number
- CN118397483A (application CN202410484889.9A)
- Authority
- CN
- China
- Prior art keywords
- fruit tree
- gradient
- convolution
- orchard
- navigation line
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V 20/17: Terrestrial scenes taken from planes or by drones
- G06N 3/0464: Convolutional networks [CNN, ConvNet]
- G06N 3/09: Supervised learning
- G06N 3/096: Transfer learning
- G06V 10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
- G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
- G06V 10/762: Recognition or understanding using pattern recognition or machine learning, using clustering, e.g. of similar faces in social networks
- G06V 10/82: Recognition or understanding using pattern recognition or machine learning, using neural networks
- G06V 20/188: Vegetation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Remote Sensing (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a fruit tree target positioning and navigation line region segmentation method based on a YOLO network, belonging to the field of automatic navigation and operation of orchard machinery. The method comprises: acquiring orchard images with an unmanned aerial vehicle (UAV); performing fruit tree target detection labeling and navigation line segmentation labeling on the orchard images in sequence to obtain a UAV orchard dataset; obtaining traversable area segmentation data and fruit tree position detection data from the UAV orchard dataset with a multi-task YOLO network; and converting the traversable area segmentation data and the fruit tree position detection data into latitude and longitude information to obtain the navigation line and fruit tree position information for an orchard pesticide spraying robot, thereby completing fruit tree target positioning and navigation line region segmentation. The invention solves the difficulty that current ground robots have in navigating and operating in orchards with complex backgrounds and limited visual perception.
Description
Technical Field
The invention belongs to the field of automatic navigation and operation of orchard machinery, and in particular relates to a fruit tree target positioning and navigation line region segmentation method based on a YOLO network.
Background
With the development of global satellite positioning systems, positioning accuracy has been improving continuously, leading to the widespread integration of GNSS, GPS, and BeiDou systems into autonomous agricultural navigation operations. The navigation system is a key component of an agricultural robot and directly affects its operating efficiency. Automated navigation of agricultural machinery is a fundamental aspect of the development of intelligent agriculture. At present, there is a variety of automatic navigation systems for agricultural machinery, including global navigation satellite system (GNSS)/inertial navigation system (INS) integrated positioning and the adaptive square-root cubature Kalman filter (ASRCKF). However, using a navigation system requires establishing a predefined navigation baseline and then adjusting the driving state in real time through a trajectory tracking algorithm to achieve precise straight-line tracking. Obtaining the navigation baseline is therefore a key step in achieving automatic navigation.
Machine vision is widely used in agricultural robot navigation because of its irreplaceable visual information and low hardware cost. At present, there is a variety of visual methods for automatically extracting navigation lines, based on detection, segmentation, classification, and LiDAR feature extraction. Detection-based methods for extracting navigation lines include the improved CS-YOLOv5 model, the ASPPF-improved YOLOv8s model, DBSCAN integrated with YOLO-R, and end-to-end YOLO networks. To address the problems that traditional machine vision recognition of crop rows is easily affected by light and weeds, has low recognition accuracy, and performs poorly in real time, segmentation algorithms are currently used to obtain crop rows and navigation paths. Existing technical solutions include: a Transformer-based semantic segmentation model; a novel autonomous navigation stack using a three-dimensional camera and an inertial measurement unit (IMU), which integrates ResNet-50 and an attention mechanism module; a vision-based method fusing vegetation indices and segmentation; an improved multi-scale efficient residual factorized convolutional neural network (MS-ERFNet) model; and a semi-supervised learning model with an improved UNet network.
Because 3D point clouds offer multi-dimensional information and low sensitivity to illumination, the use of LiDAR to obtain navigation routes has been studied extensively. Existing technical solutions have proposed: a line-segment-based method for extracting low-curvature effective road free space (RFS); an FGSeg algorithm that effectively distinguishes between level and sloped terrain and is compatible with a variety of LiDAR sensors used in agricultural field environments; a LiDAR histogram method for the comprehensive detection of traversable road areas, obstacles, and water hazards; and the extraction of straight field roads between farm fields based on agricultural vehicle-mounted LiDAR.
Fruit tree recognition and fruit tree target positioning are key technologies for spraying robots and are attracting increasing attention from researchers and developers. Existing technical solutions include: a straight and curved seedling row recognition method based on sub-region growth and outlier removal; a hybrid autonomous robotic weeding system based on a fully integrated three-axis platform and a vision system mounted on a mobile probe vehicle, which uses a pre-trained deep neural network to classify soil, main crops, and objects to be removed from the acquired spatial, color, and depth information; a new plant detection model built with a CBAM module, a BiFPN structure, and a bilinear interpolation algorithm, which can effectively learn depth information and distinguish cotton seedlings in various complex growth states; and a new method for restoring motion-blurred images based on a wide receptive field attention network (WRA-Net), which was further used to study how to improve the segmentation accuracy of crops and weeds in motion-blurred images. Although significant achievements have been made in automatic navigation and target detection, several outstanding problems remain, especially in orchard planting. When autonomous navigation relies on only a single feature, such as a traversable area like the path between trees, its robustness and reliability are greatly reduced. Moreover, because of the complex background of large orchards and the alternating growth patterns of fruit trees, the real-time perception capability of ground vehicles is limited.
Summary of the Invention
In view of the above deficiencies in the prior art, the present invention provides a fruit tree target positioning and navigation line region segmentation method based on a YOLO network, which solves the navigation and operation difficulties of current ground robots caused by complex orchard backgrounds and limited visual perception.
To achieve the above object, the present invention adopts the following technical solution: a fruit tree target positioning and navigation line region segmentation method based on a YOLO network, comprising the following steps:
S1. Acquiring orchard images with a UAV;
S2. Performing fruit tree target detection labeling and navigation line segmentation labeling on the orchard images in sequence to obtain a UAV orchard dataset;
S3. Obtaining traversable area segmentation data and fruit tree position detection data from the UAV orchard dataset with a multi-task YOLO network;
S4. Converting the traversable area segmentation data and the fruit tree position detection data into latitude and longitude information to obtain the navigation line and fruit tree position information for an orchard pesticide spraying robot, completing fruit tree target positioning and navigation line region segmentation.
Further, step S2 specifically comprises:
S201. Screening the orchard images to obtain valid orchard image data;
S202. Annotating fruit tree target anchor boxes on the valid orchard image data, and performing segmentation labeling on the fruit tree plants and field lanes to obtain the UAV orchard dataset.
Further, the multi-task YOLO network comprises a backbone network, a neck connected to the backbone, and a prediction head connected to the neck.
Further, the backbone comprises, connected in sequence, a Focus structure, a first Conv2d convolution, a first C2f gradient shunt module, a second Conv2d convolution, a second C2f gradient shunt module, a third Conv2d convolution, a third C2f gradient shunt module, a fourth Conv2d convolution, and an SPP structure; the second C2f gradient shunt module, the third C2f gradient shunt module, and the SPP structure are all connected to the neck.
Further, the neck comprises a fourth C2f gradient shunt module connected to the SPP structure, a fifth Conv2d convolution connected to the fourth C2f gradient shunt module, a first Upsample layer connected to the fifth Conv2d convolution, a first Concat layer connected to the first Upsample layer and the third C2f gradient shunt module, a fifth C2f gradient shunt module connected to the first Concat layer, a sixth Conv2d convolution connected to the fifth C2f gradient shunt module and the prediction head, a second Upsample layer connected to the sixth Conv2d convolution, and a second Concat layer connected to the second Upsample layer, the second C2f gradient shunt module, and the prediction head.
Further, the prediction head comprises a navigation line region segmentation network for segmenting the feasible navigation line region and a fruit tree target positioning network for detecting fruit tree targets.
Further, the navigation line region segmentation network comprises a seventh Conv2d convolution connected to the second Concat layer, a sixth C2f gradient shunt module connected to the seventh Conv2d convolution, a third Upsample layer connected to the sixth C2f gradient shunt module, an eighth Conv2d convolution connected to the third Upsample layer, a fourth Upsample layer connected to the eighth Conv2d convolution, a seventh C2f gradient shunt module connected to the fourth Upsample layer, a ninth Conv2d convolution connected to the seventh C2f gradient shunt module, a fifth Upsample layer connected to the ninth Conv2d convolution, a tenth Conv2d convolution connected to the fifth Upsample layer, and a Seg_Loss loss function layer connected to the tenth Conv2d convolution.
Further, the fruit tree target positioning network comprises an eighth C2f gradient shunt module connected to the second Concat layer, an eleventh Conv2d convolution connected to the eighth C2f gradient shunt module, a third Concat layer connected to the eleventh Conv2d convolution and the sixth Conv2d convolution, a twelfth Conv2d convolution connected to the third Concat layer, a fourth Concat layer connected to the twelfth Conv2d convolution, a ninth C2f gradient shunt module connected to the fourth Concat layer, and three detection heads each connected to the ninth C2f gradient shunt module; each detection head comprises a first branch formed by a Conv convolution, a Conv2d convolution, and a Bbox_Loss loss layer, and a second branch formed by a Conv convolution, a Conv2d convolution, and a Class_Loss loss layer.
Further, the loss function of the multi-task YOLO network in step S3 is:
$$L_{all} = \gamma_1 L_{det} + \gamma_2 L_{da\text{-}seg}$$
$$L_{det} = \alpha_1 L_{class} + \alpha_2 L_{obj} + \alpha_3 L_{box}$$
$$L_{da\text{-}seg} = L_{ce}$$
where $L_{all}$ is the loss function of the multi-task YOLO network; $\gamma_1$ is the detection loss weight; $L_{det}$ is the detection loss; $\gamma_2$ is the segmentation loss weight; $L_{da\text{-}seg}$ is the segmentation loss; $\alpha_1$, $\alpha_2$, and $\alpha_3$ are hyperparameters; $L_{class}$ is the classification loss, implemented as a focal loss; $L_{obj}$ is the objectness loss, also a focal loss; $L_{box}$ is the bounding-box loss, implemented as the CIoU loss; and $L_{ce}$ is the cross-entropy loss with logits.
Further, step S4 specifically comprises:
S401. Converting the fruit tree position detection data into latitude and longitude information to obtain the fruit tree position information;
S402. Extracting the edge line data corresponding to the color blocks of the traversable area from the traversable area segmentation data;
S403. Filtering noise from the edge line data corresponding to the color blocks of the traversable area to obtain navigation line pixel data;
S404. Converting the navigation line pixel data into latitude and longitude information to obtain the navigation line for the orchard pesticide spraying robot, completing fruit tree target positioning and navigation line region segmentation.
The beneficial effects of the present invention are as follows. The YOLO algorithm requires a high-quality dataset; the present invention can screen out a high-quality image dataset and allows the algorithm to learn, through supervised learning, the orchard target and orchard navigation line information required for orchard operations. Through three structures, the present invention achieves better data learning. The backbone, the main component of the model, is typically a convolutional neural network (CNN) or a residual network (ResNet); it extracts features from the input image for subsequent processing and analysis, usually has many layers and parameters, and can extract high-level feature representations of the image. The neck is the intermediate layer connecting the backbone and the head; its main role is to reduce the dimensionality of, or adjust, the features from the backbone to better suit the task, and it may consist of convolutional, pooling, or fully connected layers. The head is the last layer of the model, usually a classifier or regressor, which takes the features processed by the neck and produces the final output. The image information predicted by the algorithm, combined with the latitude and longitude information within the image, can be translated directly into actual latitude and longitude information that the orchard pesticide spraying robot can execute. This greatly reduces manual workload and improves operating efficiency and convenience: no manual field survey of the orchard is needed; only UAV flight planning, algorithm recognition, and information conversion are required to obtain the final executable task information. The present invention integrates the C2f gradient shunt module and an anchor-free module, so that inter-tree region segmentation and fruit tree detection can be performed simultaneously, which enhances the algorithm's fruit tree target detection and inter-tree path recognition capabilities, optimizes resource utilization, and improves task execution efficiency. The limited perception capability of ground robots is overcome by exploiting the aerial UAV viewpoint. The multi-task YOLO network proposed by the present invention performs well and is suitable for the automatic navigation and operation of orchard machinery.
Brief Description of the Drawings
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a structural diagram of the multi-task YOLO network of the present invention.
Detailed Description
Specific embodiments of the present invention are described below to facilitate understanding by those skilled in the art. It should be clear, however, that the present invention is not limited to the scope of these specific embodiments; to those of ordinary skill in the art, any variations are obvious as long as they fall within the spirit and scope of the present invention as defined and determined by the appended claims, and all inventions and creations making use of the inventive concept are protected.
As shown in FIG. 1, in one embodiment of the present invention, a fruit tree target positioning and navigation line region segmentation method based on a YOLO network comprises the following steps:
S1. Acquiring orchard images with a UAV;
S2. Performing fruit tree target detection labeling and navigation line segmentation labeling on the orchard images in sequence to obtain a UAV orchard dataset;
S3. Obtaining traversable area segmentation data and fruit tree position detection data from the UAV orchard dataset with a multi-task YOLO network;
S4. Converting the traversable area segmentation data and the fruit tree position detection data into latitude and longitude information to obtain the navigation line and fruit tree position information for an orchard pesticide spraying robot, completing fruit tree target positioning and navigation line region segmentation.
In this embodiment, the purpose of the present invention is to solve the problem that current ground robots have limited visual perception because of the complex orchard background. Based on extended data captured from the UAV viewpoint, a new multi-task YOLO network is proposed for fruit tree target positioning and navigation line region segmentation, obtaining the latitude and longitude of the actual targets and navigation line coordinates. First, the improved multi-task YOLO algorithm segments, in the UAV images, the area of the orchard traversable by spraying machinery and detects the positions of the fruit tree targets. Then, navigation line extraction is performed on the segmented traversable area through a series of operations including region edge extraction, contour regionalization, filtering, and navigation line extraction, to obtain the navigation line. To generate executable information, software converts the identified inter-tree route pixels and fruit tree target positioning pixels into actual latitude and longitude information for the spraying vehicle to execute. The improved multi-task YOLO network is enhanced with C2f and anchor-free modules, aiming to improve detection performance for fruit tree targets and inter-tree route extraction in orchards with complex backgrounds.
Step S2 specifically comprises:
S201. Screening the orchard images to obtain valid orchard image data;
S202. Annotating fruit tree target anchor boxes on the valid orchard image data, and performing segmentation labeling on the fruit tree plants and field lanes to obtain the UAV orchard dataset.
In this embodiment, the dataset for the algorithm was collected with a DJI UAV, and the collected data were labeled in two ways: fruit tree target detection labels and navigation line segmentation labels. LabelMe was used to annotate the fruit tree targets and field lanes in the images. All relevant information, including target object categories, bounding box coordinates, fruit tree segmentation, and field lane segmentation, is stored in *.json format. The labeled dataset was divided into training, validation, and test sets at a ratio of 8:1:1, with the validation and test sets mainly used to evaluate the performance of the network.
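For clarity, a minimal sketch of the 8:1:1 dataset split described above is given below. The directory layout, the split-file format, and reliance on LabelMe's imagePath field are illustrative assumptions, not details fixed by the patent.

```python
import json
import random
from pathlib import Path

# Minimal sketch of the 8:1:1 train/validation/test split described above.
# The directory layout and output file names are assumptions.
random.seed(0)
samples = sorted(Path("uav_orchard/labels").glob("*.json"))
random.shuffle(samples)

n = len(samples)
n_train, n_val = int(0.8 * n), int(0.1 * n)
splits = {
    "train": samples[:n_train],
    "val": samples[n_train:n_train + n_val],
    "test": samples[n_train + n_val:],
}

for name, files in splits.items():
    with open(f"{name}.txt", "w") as f:
        for p in files:
            # Each LabelMe *.json stores the image path plus the box and
            # polygon annotations for fruit trees and field lanes.
            meta = json.loads(p.read_text(encoding="utf-8"))
            f.write(meta.get("imagePath", p.stem + ".jpg") + "\n")
```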
As shown in FIG. 2, the multi-task YOLO network comprises a backbone network, a neck connected to the backbone, and a prediction head connected to the neck.
The backbone comprises, connected in sequence, a Focus structure, a first Conv2d convolution, a first C2f gradient shunt module, a second Conv2d convolution, a second C2f gradient shunt module, a third Conv2d convolution, a third C2f gradient shunt module, a fourth Conv2d convolution, and an SPP structure; the second C2f gradient shunt module, the third C2f gradient shunt module, and the SPP structure are all connected to the neck.
The neck comprises a fourth C2f gradient shunt module connected to the SPP structure, a fifth Conv2d convolution connected to the fourth C2f gradient shunt module, a first Upsample layer connected to the fifth Conv2d convolution, a first Concat layer connected to the first Upsample layer and the third C2f gradient shunt module, a fifth C2f gradient shunt module connected to the first Concat layer, a sixth Conv2d convolution connected to the fifth C2f gradient shunt module and the prediction head, a second Upsample layer connected to the sixth Conv2d convolution, and a second Concat layer connected to the second Upsample layer, the second C2f gradient shunt module, and the prediction head.
The prediction head comprises a navigation line region segmentation network for segmenting the feasible navigation line region and a fruit tree target positioning network for detecting fruit tree targets.
The navigation line region segmentation network comprises a seventh Conv2d convolution connected to the second Concat layer, a sixth C2f gradient shunt module connected to the seventh Conv2d convolution, a third Upsample layer connected to the sixth C2f gradient shunt module, an eighth Conv2d convolution connected to the third Upsample layer, a fourth Upsample layer connected to the eighth Conv2d convolution, a seventh C2f gradient shunt module connected to the fourth Upsample layer, a ninth Conv2d convolution connected to the seventh C2f gradient shunt module, a fifth Upsample layer connected to the ninth Conv2d convolution, a tenth Conv2d convolution connected to the fifth Upsample layer, and a Seg_Loss loss function layer connected to the tenth Conv2d convolution.
The fruit tree target positioning network comprises an eighth C2f gradient shunt module connected to the second Concat layer, an eleventh Conv2d convolution connected to the eighth C2f gradient shunt module, a third Concat layer connected to the eleventh Conv2d convolution and the sixth Conv2d convolution, a twelfth Conv2d convolution connected to the third Concat layer, a fourth Concat layer connected to the twelfth Conv2d convolution, a ninth C2f gradient shunt module connected to the fourth Concat layer, and three detection heads each connected to the ninth C2f gradient shunt module; each detection head comprises a first branch formed by a Conv convolution, a Conv2d convolution, and a Bbox_Loss loss layer, and a second branch formed by a Conv convolution, a Conv2d convolution, and a Class_Loss loss layer.
In this embodiment, the multi-task network enables the fruit tree targeting robot to navigate without human intervention and to identify targets accurately. The robot uses the UAV's deep visual perception to understand the scene, providing navigation data for the motion planning module together with information on fruit tree positions and the drivable area. The multi-task YOLO network was selected as the basic framework and enhanced to perform fruit tree target detection and field lane region segmentation simultaneously.
In this embodiment, the improved multi-task YOLO network consists of a backbone, a neck, and a head. The backbone consists of one Focus structure, four Conv2d convolutions, three C2f modules, and one SPP structure. The neck adopts a PAN/FPN structure that performs up- and down-sampling through multiple Concat and Upsample layers, connected with C2f and Conv2d modules. The head has two parts: one for segmenting the feasible navigation line region and one for detecting fruit tree targets. The segmentation head consists of four Conv2d convolutions, three Upsample layers, two C2f modules, and a Seg loss function. The detection head consists of two shared C2f modules, two Conv2d convolutions, and two Concat layers, from which three detection heads of different scales extend, i.e., an anchor-free structure. Finally, the network predicts and outputs the target detection results for the target trees and the traversable area for subsequent navigation line extraction. The multi-task YOLO network shares one encoder, composed of the backbone and the neck, and combines two decoders to solve fruit tree target detection and robot navigation line extraction, respectively.
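A schematic sketch of this shared-encoder, two-decoder layout is given below. The channel widths, strides, and stand-in modules are illustrative assumptions; they are not the exact Focus/C2f/SPP configuration of the patented network.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Strided conv + BN + SiLU, standing in for the backbone stages."""
    def __init__(self, c_in, c_out, stride=2):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, 3, stride, 1, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class MultiTaskYOLOSketch(nn.Module):
    def __init__(self, n_classes=1, n_seg_classes=2):
        super().__init__()
        # Shared encoder (stand-in for Focus + Conv2d + C2f + SPP).
        self.stem = ConvBlock(3, 32)      # /2
        self.stage2 = ConvBlock(32, 64)   # /4
        self.stage3 = ConvBlock(64, 128)  # /8
        # Segmentation decoder: upsample back to the input resolution.
        self.seg_head = nn.Sequential(
            nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False),
            nn.Conv2d(128, n_seg_classes, 1),
        )
        # Anchor-free detection decoder: per-cell box offsets (4),
        # objectness (1), and class scores on the /8 feature map.
        self.det_head = nn.Conv2d(128, 4 + 1 + n_classes, 1)

    def forward(self, x):
        f = self.stage3(self.stage2(self.stem(x)))
        return self.det_head(f), self.seg_head(f)

det, seg = MultiTaskYOLOSketch()(torch.randn(1, 3, 640, 640))
print(det.shape, seg.shape)  # [1, 6, 80, 80] and [1, 2, 640, 640]
```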
The loss function of the multi-task YOLO network in step S3 is:
$$L_{all} = \gamma_1 L_{det} + \gamma_2 L_{da\text{-}seg}$$
$$L_{det} = \alpha_1 L_{class} + \alpha_2 L_{obj} + \alpha_3 L_{box}$$
$$L_{da\text{-}seg} = L_{ce}$$
where $L_{all}$ is the loss function of the multi-task YOLO network; $\gamma_1$ is the detection loss weight; $L_{det}$ is the detection loss; $\gamma_2$ is the segmentation loss weight; $L_{da\text{-}seg}$ is the segmentation loss; $\alpha_1$, $\alpha_2$, and $\alpha_3$ are hyperparameters; $L_{class}$ is the classification loss, implemented as a focal loss; $L_{obj}$ is the objectness loss, also a focal loss; $L_{box}$ is the bounding-box loss, implemented as the CIoU loss; and $L_{ce}$ is the cross-entropy loss with logits.
In this embodiment, a suitable hardware configuration and software environment were selected for training the detection network. The platform is a desktop computer running Ubuntu 22.04, equipped with a high-performance Intel i9-13900K CPU, 64 GB of memory, and an NVIDIA RTX 4090 24 GB GPU. The experimental code used CUDA 11.1, cuDNN 8.6.0, Python 3.8, PyTorch 1.8.0, and VSCode. The network input size was set to 640×640 and the batch size to 16. During training, stochastic gradient descent with the Adam optimizer was used, with the initial learning rate set to 0.01, a weight decay coefficient of 0.0005, and a momentum of 0.937. In addition, the model was loaded with official pre-trained weights to accelerate training and improve performance.
The multi-task YOLO network contains two output heads, one for detection and one for segmentation, so its loss function consists of a detection loss and a segmentation loss. The detection loss is composed of a classification loss, an objectness loss, and a bounding-box loss. The classification and objectness losses use the focal loss function to reduce the loss contribution of well-classified samples so that the network focuses on hard samples. The bounding-box loss uses the CIoU loss, which jointly considers the distance, overlap, similarity, and aspect ratio between the predicted and ground-truth boxes. The drivable-area segmentation loss uses cross-entropy with logits, aiming to minimize the classification error between network output pixels and target pixels. The final total loss is the weighted sum of these parts, and the hyperparameters must be tuned to balance the different sub-tasks of multi-task perception.
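As a minimal sketch of how this weighted total loss can be assembled, consider the snippet below. The focal-loss parameters are common defaults rather than values from the patent, and torchvision's complete_box_iou_loss (available in recent torchvision releases, newer than the PyTorch 1.8.0 environment quoted above) is substituted for the CIoU term.

```python
import torch
import torch.nn.functional as F
from torchvision.ops import complete_box_iou_loss

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    # Binary focal loss built on BCE-with-logits; it down-weights
    # well-classified samples as described above.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = torch.exp(-bce)  # probability the model assigns to the true label
    return (alpha * (1.0 - p_t) ** gamma * bce).mean()

def total_loss(cls_logits, cls_t, obj_logits, obj_t,
               boxes_pred, boxes_true, seg_logits, seg_t,
               alphas=(1.0, 1.0, 1.0), gammas=(1.0, 1.0)):
    # L_det = a1*L_class + a2*L_obj + a3*L_box
    l_class = focal_loss(cls_logits, cls_t)
    l_obj = focal_loss(obj_logits, obj_t)
    l_box = complete_box_iou_loss(boxes_pred, boxes_true, reduction="mean")
    l_det = alphas[0] * l_class + alphas[1] * l_obj + alphas[2] * l_box
    # L_da-seg: cross-entropy with logits over the segmentation classes.
    l_seg = F.cross_entropy(seg_logits, seg_t)
    # L_all = g1*L_det + g2*L_da-seg
    return gammas[0] * l_det + gammas[1] * l_seg
```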
In this embodiment, several data augmentation techniques and methods were adopted to improve model performance. The k-means clustering algorithm was used to extract prior anchors from the dataset, enhancing the detector's prior knowledge of objects in fruit tree target scenes. For model optimization, Adam was selected as the optimizer, and the initial learning rate, γ1, and γ2 were set to 0.001, 0.937, and 0.999, respectively. Warmup and cosine annealing strategies were used to adjust the learning rate, accelerating convergence and improving model performance. Data augmentation was used to increase image diversity and ensure model robustness in different environments.
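A minimal sketch of the warmup plus cosine-annealing schedule mentioned above follows; the warmup length and the final learning-rate floor are assumptions not specified in the text.

```python
import math

def lr_at(epoch, total_epochs=250, base_lr=0.001, warmup_epochs=3, min_lr=1e-5):
    if epoch < warmup_epochs:
        # Linear warmup from zero up to the base learning rate.
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay from base_lr down to min_lr over the remaining epochs.
    t = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t))

print([round(lr_at(e), 6) for e in (0, 2, 50, 150, 249)])
```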
Step S4 specifically comprises:
S401. Converting the fruit tree position detection data into latitude and longitude information to obtain the fruit tree position information;
S402. Extracting the edge line data corresponding to the color blocks of the traversable area from the traversable area segmentation data;
S403. Filtering noise from the edge line data corresponding to the color blocks of the traversable area to obtain navigation line pixel data;
S404. Converting the navigation line pixel data into latitude and longitude information to obtain the navigation line for the orchard pesticide spraying robot, completing fruit tree target positioning and navigation line region segmentation.
In this embodiment, the navigation line is extracted from the traversable area predicted by the multi-task YOLO network. The main steps are: image input, prediction of the traversable area by the multi-task YOLO algorithm, color block extraction and edge extraction of the segmented region, denoising, and navigation line extraction.
Since the multi-task YOLO network predicts color blocks for the traversable area, the edge regions corresponding to the color blocks must be extracted for navigation line extraction. The region extraction algorithm used in this embodiment involves the following steps:
Step 1: Set the HSV values of blue pixels to the range threshold (110, 50, 50) to (130, 255, 255), and the HSV values of red pixels to the range threshold (0, 100, 100) to (10, 255, 255).
Step 2: Image processing: binarize the image so that pixel values are represented only by 0 and 1. Pixel values within the threshold range become 0, and those outside it become 1.
Step 3: Image traversal: use a kernel whose left side is zero and right side is one. Traverse the image from the upper-left corner, left to right and top to bottom. The first point encountered that aligns with the specified kernel is taken as the initial point of the outer boundary of the geometric entity.
Step 4: Boundary tracking: establish a specific direction for boundary tracking, then start a loop that searches for points to determine the boundary of the geometric structure.
Step 5: Compute the navigation line corresponding to the middle of the color block from the geometric boundary.
The multi-task YOLO network may produce considerable noise and prediction errors when processing UAV images, especially when predicting color blocks. To improve the accuracy of navigation line extraction, a filtering method based on pixel count is used to eliminate algorithmic errors, retaining color blocks with large pixel counts such as fruit trees and field lanes. After the color blocks are processed, denoising is performed, the color blocks are fitted with a rectangular matrix, and finally the navigation line is extracted. This approach improves the robustness and efficiency of navigation line extraction and offers a novel solution compared with traditional single-task methods based on crop rows or field lanes. A minimal sketch of this pipeline is given below.
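Under these descriptions, a minimal OpenCV sketch of the mask post-processing might look as follows. The blue HSV range follows step 1 above; the minimum blob area and the per-row centreline computation are illustrative assumptions.

```python
import cv2
import numpy as np

def extract_navigation_line(pred_bgr, min_area=500):
    hsv = cv2.cvtColor(pred_bgr, cv2.COLOR_BGR2HSV)
    # Blue lane block, thresholded with the (110,50,50)-(130,255,255)
    # HSV range from step 1.
    mask = cv2.inRange(hsv, (110, 50, 50), (130, 255, 255))
    # Pixel-count filtering: drop small noisy blobs, keeping large
    # regions such as field lanes.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    clean = np.zeros_like(mask)
    for i in range(1, n):  # label 0 is the background
        if stats[i, cv2.CC_STAT_AREA] >= min_area:
            clean[labels == i] = 255
    # Centreline: midpoint of the lane pixels in every image row.
    line = []
    for y in range(clean.shape[0]):
        xs = np.flatnonzero(clean[y])
        if xs.size:
            line.append((int(xs.mean()), y))
    return clean, line
```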
Verification of the algorithm's prediction performance:
To evaluate the performance of the multi-task YOLO network on fruit tree target detection and navigation line segmentation, images captured by the UAV were used as the test set, and the model was trained for 250 epochs. During training, the model's loss decreased gradually, while mIoU and accuracy rose significantly after the 50th epoch. At the 250th epoch the model converged, and the accuracies of fruit tree target detection and navigation line segmentation reached 77.80% and 84.37%, respectively. These results show that the model is highly accurate in fruit tree target detection and navigation line segmentation, helping to reduce agricultural economic losses and promote the application of new methods in this field.
The multi-task YOLO network, trained with an alternating update strategy, performed well on the fruit tree target detection and navigation line region segmentation tasks, with precision, recall, mAP50, mIoU, and accuracy reaching 84.37%, 75.30%, 81.01%, 77.80%, and 86.35%, respectively. In this embodiment, the baseline algorithm YOLOP was optimized by introducing the C2f module and the anchor-free module, improving model performance. The C2f module combines low-level and high-level feature maps, enhancing the model's ability to capture fine gradient flow information. The anchor-free module reduces the dependence on anchor boxes, further optimizing the model. The results on the test set show that these improvements effectively increase the accuracy and mIoU of the multi-task YOLO network.
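The pixel accuracy and mIoU figures above are standard segmentation metrics; a minimal sketch of computing them from a confusion matrix is given below, with the two-class (lane versus background) setting taken as an assumption.

```python
import numpy as np

def seg_metrics(pred, target, n_classes=2):
    # pred and target are integer label maps of identical shape.
    cm = np.bincount(n_classes * target.ravel() + pred.ravel(),
                     minlength=n_classes ** 2).reshape(n_classes, n_classes)
    acc = np.diag(cm).sum() / cm.sum()  # pixel accuracy
    with np.errstate(divide="ignore", invalid="ignore"):
        iou = np.diag(cm) / (cm.sum(0) + cm.sum(1) - np.diag(cm))
    return acc, np.nanmean(iou)         # (accuracy, mIoU)
```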
The multi-task YOLO network performs well on the navigation line region segmentation task. Comparative experiments show that although it may not surpass PP-LiteSeg, it has advantages as a multi-task integrated algorithm. Against the complex background of UAV orchard images, the multi-task YOLO network, through the C2f module and a powerful feature extraction backbone, outperforms the Deeplabv3 and SegNet models and effectively alleviates the difficulty of navigation line extraction. Compared with other algorithms in the YOLO series, the improved multi-task YOLO significantly improves precision and mAP50, demonstrating its value as an optimized solution for automatic pesticide spraying robots in orchards.
Using the orchard images and fruit tree target dataset collected by the UAV, the present invention proposes an enhanced multi-task YOLO network for navigation line region extraction and fruit tree target detection, providing navigation for unmanned weeders. The model structure is improved with the C2f module, anchor dependence is reduced, and detection and feature extraction capabilities are improved. Experimental results show that the improved model increases training precision and mIoU by 4.27% and 2.50%, respectively. In addition, the present invention extracts navigation lines based on crop rows and field roads through a series of image processing operations and converts them into actual latitude and longitude information. Compared with the positioning information of professional equipment, the error is only 5.472 cm.
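A hedged sketch of the pixel-to-coordinate conversion follows. It assumes a nadir-pointing UAV image whose centre geotag (lat0, lon0) and ground sampling distance are known, and uses a flat-earth small-offset approximation; the actual system may rely on the flight platform's full camera and positioning model.

```python
import math

def pixel_to_latlon(px, py, img_w, img_h, lat0, lon0, gsd_m):
    # Offsets from the image centre, converted to metres via the ground
    # sampling distance (metres per pixel); y grows downwards in images.
    dx_m = (px - img_w / 2) * gsd_m   # east offset
    dy_m = (img_h / 2 - py) * gsd_m   # north offset
    dlat = dy_m / 111320.0            # approx. metres per degree of latitude
    dlon = dx_m / (111320.0 * math.cos(math.radians(lat0)))
    return lat0 + dlat, lon0 + dlon
```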
The advantage of the above design is as follows. Previous research on navigation and target detection has focused mainly on real-time visual navigation of vehicles or inter-row extraction based on target detection. However, ground robots have perceptual limitations because of their low viewpoint, and existing target detection and path extraction methods are effective only for sparse orchards; they cannot extract effectively in orchards with complex backgrounds and dense, alternating fruit tree planting. The present invention therefore combines the unmanned aerial vehicle (UAV) viewpoint with a new multi-task YOLO algorithm and proposes a navigation method for orchard spraying vehicles that manages actual latitude and longitude coordinate information based on orchard navigation region extraction and fruit tree target acquisition. First, the improved multi-task YOLO network segments, in the UAV images, the area of the orchard traversable by spraying machinery and detects the positions of fruit tree targets. Then, navigation line extraction is performed on the segmented traversable area through a series of operations including region edge extraction, contour regionalization, filtering, and navigation line extraction, to obtain the navigation line. To generate executable information, software converts the identified inter-tree route pixels and fruit tree target positioning pixels into actual latitude and longitude information for the spraying vehicle to execute. The improved network is enhanced with C2f and anchor-free modules. The results show that, compared with the original model, the training accuracy of the improved multi-task YOLO network is increased by 4.27% and the mIoU by 2.50%. These results show that the navigation and target positioning methods proposed by the present invention can effectively solve the limited perception and positioning problems of ground robots.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410484889.9A CN118397483A (en) | 2024-04-22 | 2024-04-22 | Fruit tree target positioning and navigation line region segmentation method based on YOLO network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118397483A true CN118397483A (en) | 2024-07-26 |
Family
ID=91987241
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410484889.9A Pending CN118397483A (en) | 2024-04-22 | 2024-04-22 | Fruit tree target positioning and navigation line region segmentation method based on YOLO network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118397483A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN119107581A (en) * | 2024-07-31 | 2024-12-10 | 华北科技学院(中国煤矿安全技术培训中心) | A method for identifying coal mine water hazard signs based on video monitoring |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220301224A1 (en) * | 2021-03-16 | 2022-09-22 | Shanghai United Imaging Healthcare Co., Ltd. | Systems and methods for image segmentation |
CN116665176A (en) * | 2023-07-21 | 2023-08-29 | 石家庄铁道大学 | Multi-task network road target detection method for vehicle automatic driving |
CN117274674A (en) * | 2023-09-05 | 2023-12-22 | 江苏大学 | Target application method, electronic device, storage medium and system |
CN117292278A (en) * | 2023-09-28 | 2023-12-26 | 华南农业大学 | Digital orchard fruit tree positioning method, device, equipment and medium |
CN117710764A (en) * | 2023-11-24 | 2024-03-15 | 中国重汽集团济南动力有限公司 | Training method, device and medium for multi-task perception network |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |