CN118397603A - Pallet recognition system and pallet recognition method for mobile robot - Google Patents
Pallet recognition system and pallet recognition method for mobile robot
- Publication number
- CN118397603A (application CN202410596710.9A)
- Authority
- CN
- China
- Prior art keywords
- pallet
- feature vector
- sparse
- image
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/136—Segmentation; Edge detection involving thresholding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/513—Sparse representations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Image Analysis (AREA)
Abstract
The application provides a pallet identification system and method for a mobile robot, and relates to the field of pallet identification. A front image of the pallet is first acquired by a binocular camera; feature extraction is then performed on the acquired pallet images to obtain pallet-related features; the pallet-related features are subjected to sparse optimization; and finally the sparse-optimized features are input into a generator to obtain the pose information of the pallet. In this way, automatic recognition of the pose information of the pallet is realized and the accuracy of pose recognition is improved.
Description
Technical Field
The present application relates to the field of pallet identification, and more particularly, to a pallet identification system and method for a mobile robot.
Background
In scenarios such as automated warehouses, logistics centers and production lines, accurate knowledge of the pose information of the pallet is the basis for automatic handling, storage and identification. Traditionally, pallet position and attitude information is obtained from an encoder mounted on the machine and fed back by it; however, errors accumulate during long-term operation of the encoder, so the accuracy of the pose information gradually degrades.
Thus, there is a need for an optimized pallet identification scheme for mobile robots.
Disclosure of Invention
The present application has been made to solve the above technical problems. Embodiments of the application provide a pallet identification system and a pallet identification method for a mobile robot that adopt an image processing technique based on deep learning: a front image of the pallet is first acquired by a binocular camera, feature extraction is then performed on the acquired pallet images to obtain pallet-related features, sparse optimization is applied to those features, and finally the sparse-optimized features are input into a generator to obtain the pose information of the pallet. In this way, automatic recognition of the pose information of the pallet is realized and the accuracy of pose recognition is improved.
According to an aspect of the present application, there is provided a pallet recognition method of a mobile robot, including:
Acquiring a pallet front image shot by a left camera and a pallet front image shot by a right camera;
Performing feature extraction on the pallet front image shot by the left camera and the pallet front image shot by the right camera to obtain pallet feature vectors;
And obtaining the pose information of the pallet based on the pallet feature vector.
According to another aspect of the present application, there is provided a pallet recognition system of a mobile robot, including:
The pallet related data acquisition module is used for acquiring a pallet front image shot by the left camera and a pallet front image shot by the right camera;
The pallet related data feature coding module is used for carrying out feature extraction on the pallet front image shot by the left camera and the pallet front image shot by the right camera to obtain pallet feature vectors;
And the pallet pose information result generation module is used for obtaining the pose information of the pallet based on the pallet feature vector.
Compared with the prior art, the pallet identification system and pallet identification method for a mobile robot provided by the application adopt an image processing technique based on deep learning: a front image of the pallet is first acquired by a binocular camera, feature extraction is then performed on the acquired pallet images to obtain pallet-related features, sparse optimization is applied to those features, and finally the sparse-optimized features are input into a generator to obtain the pose information of the pallet. In this way, automatic recognition of the pose information of the pallet is realized and the accuracy of pose recognition is improved.
Drawings
The above and other objects, features and advantages of the present application will become more apparent from the following detailed description of embodiments of the present application with reference to the accompanying drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the application and are incorporated in and constitute a part of this specification; together with the embodiments of the application they serve to explain the application and do not constitute a limitation to the application. In the drawings, like reference numerals generally refer to like parts or steps.
Fig. 1 is a flowchart of a pallet identification method of a mobile robot according to an embodiment of the present application.
Fig. 2 is a flowchart of a pallet identification method of a mobile robot according to an embodiment of the present application, where feature extraction is performed on a pallet front image captured by the left camera and a pallet front image captured by the right camera to obtain pallet feature vectors.
Fig. 3 is a flowchart of feature extraction of the left and right binarized pallet images to obtain left and right pallet SIFT feature vectors in the pallet recognition method of the mobile robot according to the embodiment of the present application.
Fig. 4 is a flowchart of obtaining the pose information of the pallet based on the pallet feature vector in the pallet recognition method of the mobile robot according to the embodiment of the present application.
Fig. 5 is a flowchart of a method for identifying a pallet of a mobile robot according to an embodiment of the present application, in which a sparsity constraint based on a model parameter space is performed on the pallet feature vector to obtain a sparse optimized pallet feature vector.
Fig. 6 is a flowchart of a pallet identification method of a mobile robot according to an embodiment of the present application, in which the pallet feature vector is processed based on the sparse model weight matrix and the sparse model bias vector to obtain the sparse optimized pallet feature vector.
Fig. 7 is a system block diagram of a pallet identification system of a mobile robot according to an embodiment of the present application.
Detailed Description
Hereinafter, exemplary embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described herein.
In modern settings such as busy automated warehouses, efficient logistics centers and advanced production lines, accurate knowledge of the pose information of the pallet is undoubtedly a cornerstone of smooth logistics operation and correct automated operation. The pallet is an important tool for storing and transporting articles, and accurate identification of its position and attitude is essential for rapid transport, accurate storage and effective identification of the articles. However, conventional approaches to acquiring pallet pose information typically rely on encoders mounted on the machine. These encoders provide the system with pallet position and attitude information by continuously feeding back data. However, with long-term operation of the encoder, some accumulation of error inevitably occurs. These errors may result from various factors such as wear of mechanical parts, changes in the external environment, or aging of electronic components. Over time, these errors accumulate and eventually lead to a significant drop in the accuracy of the pallet pose information. In practice, such errors may lead to serious consequences. For example, during automated handling, if the system fails to accurately identify the position and attitude of the pallet, the goods may be misplaced, bumped, or even dropped, resulting in damage to the goods and production interruptions. Thus, an optimized pallet identification scheme for mobile robots is desired.
In recent years, deep learning and neural networks have been widely applied in fields such as computer vision, natural language processing and text signal processing. Deep learning and neural networks have also reached, and in some cases exceeded, human-level performance in image classification, object detection, semantic segmentation, text translation and similar tasks. The development of deep learning and neural networks provides new ideas and solutions for pallet identification by mobile robots.
Fig. 1 is a flowchart of a pallet identification method of a mobile robot according to an embodiment of the present application. As shown in fig. 1, a pallet recognition method of a mobile robot according to an embodiment of the present application includes: S110, acquiring a pallet front image shot by a left camera and a pallet front image shot by a right camera; S120, extracting features of the pallet front image shot by the left camera and the pallet front image shot by the right camera to obtain pallet feature vectors; and S130, obtaining pose information of the pallet based on the pallet feature vector.
Specifically, in the technical scheme of the application, the front image of the pallet captured by the left camera and the front image of the pallet captured by the right camera are first acquired. Stereoscopic information of the pallet can be obtained by photographing the same pallet with two cameras at different angles (typically separated by a distance in the horizontal direction). Each camera captures the pallet from a different perspective, providing different cues about the depth and shape of the pallet in space. Features such as edges and corner points of the pallet can be extracted from the images of the left and right cameras. Corresponding feature points in the left and right images are then found through a feature matching algorithm; the matched feature points provide the positional relationship of the pallet in three-dimensional space, and, combined with the intrinsic and extrinsic parameters of the cameras, the position and attitude of the pallet relative to the mobile robot can be calculated. It is also worth mentioning that using a binocular vision system can reduce the impact of problems such as occlusion, shadowing or uneven illumination that may affect a single camera, because even if one camera is disturbed by these factors, the other camera may still capture valid information.
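As a concrete illustration of the binocular principle described above (and not of the specific pipeline claimed below), the following is a minimal OpenCV sketch that matches SIFT keypoints between the left and right views and triangulates the matches into three-dimensional points; the file names, the identity intrinsic matrix and the 12 cm baseline in the projection matrices are assumptions standing in for a real stereo calibration.

```python
import cv2
import numpy as np

# Left/right pallet front images (file names are illustrative assumptions).
img_left = cv2.imread("pallet_left.png", cv2.IMREAD_GRAYSCALE)
img_right = cv2.imread("pallet_right.png", cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute descriptors in both views.
sift = cv2.SIFT_create()
kp_l, des_l = sift.detectAndCompute(img_left, None)
kp_r, des_r = sift.detectAndCompute(img_right, None)

# Match descriptors between the two views and keep the best matches.
matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = sorted(matcher.match(des_l, des_r), key=lambda m: m.distance)[:200]

# Matched pixel coordinates as 2xN arrays, as required by triangulatePoints.
pts_l = np.float32([kp_l[m.queryIdx].pt for m in matches]).T
pts_r = np.float32([kp_r[m.trainIdx].pt for m in matches]).T

# Triangulate matches into 3D points (assumed calibration: identity intrinsics
# and a 12 cm horizontal baseline) to recover the pallet's position in space.
P_left = np.eye(3, 4, dtype=np.float64)
P_right = np.hstack([np.eye(3), np.array([[-0.12], [0.0], [0.0]])])
points_4d = cv2.triangulatePoints(P_left, P_right, pts_l, pts_r)
points_3d = (points_4d[:3] / points_4d[3]).T  # N x 3 pallet surface points
```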
Fig. 2 is a flowchart of a pallet identification method of a mobile robot according to an embodiment of the present application, where feature extraction is performed on a pallet front image captured by the left camera and a pallet front image captured by the right camera to obtain pallet feature vectors. As shown in fig. 2, the feature extraction of the pallet front image captured by the left camera and the pallet front image captured by the right camera to obtain pallet feature vectors includes: s121, performing preliminary processing on the pallet front image shot by the left camera and the pallet front image shot by the right camera to obtain a left binarized pallet image and a right binarized pallet image; s122, extracting features of the left binarized pallet image and the right binarized pallet image to obtain a left pallet SIFT feature vector and a right pallet SIFT feature vector; s123, fusing the left pallet SIFT feature vector and the right pallet SIFT feature vector to obtain the pallet feature vector.
The images acquired by the left and right cameras contain noise and color information that would interfere with subsequent processing; if the original images were used directly for feature extraction and pose estimation, the extracted features would be inaccurate and the pose estimation accuracy would suffer. Preliminary processing of the originally acquired data is therefore needed: image binarization is performed on the pallet front image captured by the left camera and the pallet front image captured by the right camera to obtain a left binarized pallet image and a right binarized pallet image. Binarization divides the pixels of the pallet front image into black and white by setting a threshold, which removes noise and fine detail and simplifies the image information. This helps to highlight the main features of the pallet and improves the accuracy of subsequent feature extraction. Moreover, the data volume of the binarized pallet images is greatly reduced, which significantly speeds up image processing and allows the mobile robot to identify the pallet and calculate its pose more quickly.
In an embodiment of the present application, performing image binarization on the pallet front image captured by the left camera and the pallet front image captured by the right camera to obtain a left binarized pallet image and a right binarized pallet image may proceed as follows: a. first, an image processing library (such as OpenCV) is used to read the pallet front images captured by the left and right cameras, ensuring that the image format is correct and that the images load properly; each color image is then converted to a grayscale image, since grayscale conversion is a common step before binarization that reduces the data volume of the image while retaining enough shape and texture information; b. a suitable thresholding method is selected according to the actual condition of the pallet images; common methods include the fixed-threshold method and the adaptive-threshold method; the fixed-threshold method is simple and fast but may not suit all scenes, whereas the adaptive-threshold method automatically adjusts the threshold according to local characteristics of the image and is generally more flexible; the binarization threshold is then determined according to the chosen method so that the pallet can be effectively separated from the background; c. each pixel value in the grayscale image is compared with the set threshold: if the pixel value is greater than or equal to the threshold, the pixel is set to white; otherwise it is set to black. After this comparison and assignment, the left binarized pallet image and the right binarized pallet image are obtained.
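A minimal OpenCV sketch of steps a to c above; the file names and threshold values are illustrative assumptions to be tuned for the actual scene, not values prescribed by the application.

```python
import cv2

# a. Read the left/right pallet front images and convert them to grayscale.
left_gray = cv2.cvtColor(cv2.imread("pallet_left.png"), cv2.COLOR_BGR2GRAY)
right_gray = cv2.cvtColor(cv2.imread("pallet_right.png"), cv2.COLOR_BGR2GRAY)

# b. Choose a thresholding method. A fixed threshold is shown first; an
#    adaptive threshold can be used instead when lighting varies locally.
FIXED_THRESHOLD = 127  # assumed value; tune for the actual scene

# c. Compare each pixel with the threshold: >= threshold -> white (255),
#    otherwise -> black (0), giving the binarized pallet images.
_, left_bin = cv2.threshold(left_gray, FIXED_THRESHOLD, 255, cv2.THRESH_BINARY)
_, right_bin = cv2.threshold(right_gray, FIXED_THRESHOLD, 255, cv2.THRESH_BINARY)

# Adaptive alternative: the threshold is computed per local neighborhood.
left_bin_adaptive = cv2.adaptiveThreshold(
    left_gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
    cv2.THRESH_BINARY, blockSize=31, C=5)
```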
Fig. 3 is a flowchart of feature extraction of the left and right binarized pallet images to obtain left and right pallet SIFT feature vectors in the pallet recognition method of the mobile robot according to the embodiment of the present application. As shown in fig. 3, the feature extraction of the left and right binarized pallet images to obtain a left pallet SIFT feature vector and a right pallet SIFT feature vector includes: S1221, respectively extracting Canny edge detection maps of the left binarized pallet image and the right binarized pallet image to obtain a left pallet Canny edge detection map and a right pallet Canny edge detection map; S1222, respectively performing SIFT-algorithm-based operations on the left pallet Canny edge detection map and the right pallet Canny edge detection map to obtain the left pallet SIFT feature vector and the right pallet SIFT feature vector.
Performing feature extraction and pose estimation directly on the binarized images may still be inaccurate, because some non-critical details or noise remain in the images. The binarized pallet images therefore require further processing: the Canny edge detection maps of the left binarized pallet image and the right binarized pallet image are extracted respectively to obtain a left pallet Canny edge detection map and a right pallet Canny edge detection map. Canny edge detection is a widely used edge detection algorithm that effectively extracts edge information from an image, which is important for identifying the shape and position of the pallet. Canny edge detection identifies and emphasizes the edges in an image, and edges are often the key information for shape and position recognition. In the application, performing edge detection on the binarized images further highlights the outline of the pallet, making feature extraction more accurate. Canny edge detection also filters out non-edge noise, which reduces interference with subsequent feature matching and pose estimation and improves the accuracy and stability of pose estimation. Although Canny edge detection itself takes a certain amount of computation time, extracting the edge information reduces the amount of data handled in subsequent processing, so the overall processing speed is improved to some extent. The general steps of Canny edge detection are as follows: first, a Gaussian filter is applied to remove noise; next, the gradients and their directions are computed on the smoothed image with the Sobel operator, where the direction is typically angle = arctan(vertical gradient / horizontal gradient); the gradient direction is generally perpendicular to the edge, and gradient directions are usually quantized into four categories (vertical, horizontal and the two diagonals); then, with the gradients and directions obtained, the image is traversed and non-edge points are removed: each pixel is checked to determine whether it is the local maximum of the gradient among its neighbours along the same direction, and only such maxima are kept (non-maximum suppression); finally, hysteresis thresholding is applied with a low threshold minVal and a high threshold maxVal: gradient values greater than maxVal are kept, gradient values smaller than minVal are discarded, and gradient values between minVal and maxVal are kept only if they are connected to an edge and are discarded otherwise.
In an embodiment of the present application, extracting the Canny edge detection maps of the left binarized pallet image and the right binarized pallet image to obtain the left pallet Canny edge detection map and the right pallet Canny edge detection map may proceed as follows: a. the left binarized pallet image and the right binarized pallet image are imported into an image processing library, such as OpenCV, which provides an implementation of Canny edge detection; b. the Canny edge detection algorithm requires two thresholds: a low threshold for determining the starting point of an edge and a high threshold for tracking the edge; the choice of thresholds depends on the noise level of the image and the sharpness of the edges, and they are typically determined experimentally or adaptively; c. the Canny edge detection algorithm is applied to the left and right binarized pallet images, typically by calling the cv2.Canny() function in OpenCV; d. after Canny edge detection, two new images are obtained, representing the edge detection results of the left pallet and the right pallet respectively. These images contain only the detected edge information, typically displayed as white lines.
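Continuing the sketch, Canny edge detection can be applied to the two binarized pallet images as follows; the file names and hysteresis thresholds are assumptions, chosen here only for illustration, and would in practice be determined experimentally as noted in step b.

```python
import cv2

# Binarized pallet images from the previous step (file names are assumptions).
left_bin = cv2.imread("left_pallet_binarized.png", cv2.IMREAD_GRAYSCALE)
right_bin = cv2.imread("right_pallet_binarized.png", cv2.IMREAD_GRAYSCALE)

# Hysteresis thresholds: gradients above the high threshold are kept as edges,
# gradients below the low threshold are discarded, and values in between are
# kept only when connected to a strong edge. The values here are illustrative.
LOW_THRESHOLD, HIGH_THRESHOLD = 50, 150

left_canny = cv2.Canny(left_bin, LOW_THRESHOLD, HIGH_THRESHOLD)
right_canny = cv2.Canny(right_bin, LOW_THRESHOLD, HIGH_THRESHOLD)

# The resulting maps contain only edge pixels (white lines on black background).
cv2.imwrite("left_pallet_canny.png", left_canny)
cv2.imwrite("right_pallet_canny.png", right_canny)
```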
Next, SIFT-algorithm-based operations are performed on the left pallet Canny edge detection map and the right pallet Canny edge detection map respectively to obtain a left pallet SIFT feature vector and a right pallet SIFT feature vector. The SIFT (Scale-Invariant Feature Transform) algorithm is widely used in computer vision to extract unique, distinguishable features from images. These features remain unchanged under changes in scale, rotation and illumination, and are therefore particularly suitable for tasks such as image recognition, object tracking and three-dimensional reconstruction. The main role of the SIFT algorithm in image processing is that it can extract stable keypoints from an image and compute the orientation of those keypoints. Each keypoint is associated with a 128-dimensional feature descriptor that encodes the gradient information of the pixels around the keypoint, making it invariant to changes in scale, rotation and illumination. SIFT features therefore retain their uniqueness even under different viewing angles or in the presence of noise, and are effective for image matching and object recognition. The SIFT-based operation is performed on the Canny edge detection maps of the pallet because the Canny maps highlight the edge information of the pallet, and the SIFT algorithm can extract stable and unique feature points from that edge information. Combining the two methods allows the position of the pallet in the image to be located more accurately and the key features needed for subsequent pose estimation to be extracted. If the SIFT-based operation were omitted and pose estimation relied on the Canny edge detection maps alone, the lack of stable feature points could cause inaccurate matching or matching failure. Through its distinctive feature description and matching mechanism, the SIFT algorithm can significantly improve the accuracy and stability of pose estimation.
In an embodiment of the present application, performing the SIFT-algorithm-based operations on the left pallet Canny edge detection map and the right pallet Canny edge detection map to obtain the left pallet SIFT feature vector and the right pallet SIFT feature vector may proceed as follows: a. the SIFT algorithm is first introduced; in the OpenCV library it is provided by cv2.SIFT_create(). A SIFT object is initialized, and parameters of the SIFT algorithm such as the contrast threshold, the edge threshold and the sigma value may be adjusted to optimize feature extraction. The SIFT object is then used to detect feature points in the left pallet Canny edge detection map and the right pallet Canny edge detection map respectively, returning two lists that contain the keypoint objects of the left pallet image and the right pallet image; b. for each detected keypoint, the SIFT algorithm computes the corresponding 128-dimensional feature descriptor; these descriptors encode the gradient information of the pixels around the keypoint and are thus invariant to scale, rotation and illumination changes, and each computed descriptor is associated with its keypoint object; c. the keypoint objects and their corresponding feature descriptors are combined into feature vectors, where each feature vector comprises the position information of the keypoint (such as its x and y coordinates) and the 128-dimensional descriptor, yielding the left pallet SIFT feature vector and the right pallet SIFT feature vector respectively.
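A minimal sketch of steps a to c above; the SIFT parameter values and the way the keypoint positions and descriptors are packed into one array are assumptions for illustration.

```python
import cv2
import numpy as np

# Canny edge maps of the left and right pallets from the previous step.
left_canny = cv2.imread("left_pallet_canny.png", cv2.IMREAD_GRAYSCALE)
right_canny = cv2.imread("right_pallet_canny.png", cv2.IMREAD_GRAYSCALE)

# a. Initialize SIFT; contrast/edge thresholds and sigma are assumed values.
sift = cv2.SIFT_create(contrastThreshold=0.04, edgeThreshold=10, sigma=1.6)

# b. Detect keypoints and compute their 128-dimensional descriptors.
kp_left, des_left = sift.detectAndCompute(left_canny, None)
kp_right, des_right = sift.detectAndCompute(right_canny, None)

# c. Combine each keypoint's (x, y) position with its 128-d descriptor,
#    giving the left/right pallet SIFT feature vectors.
def to_feature_vectors(keypoints, descriptors):
    positions = np.float32([kp.pt for kp in keypoints])  # N x 2
    return np.hstack([positions, descriptors])           # N x 130

left_pallet_sift = to_feature_vectors(kp_left, des_left)
right_pallet_sift = to_feature_vectors(kp_right, des_right)
```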
The left and right pallet images capture information about the pallet from different sides or viewing angles. Information from a single viewing angle alone can be limiting for the mobile robot, for example under changes in illumination or occlusion, so the information from the two viewing angles needs to be fused to enrich the feature description of the pallet. In the technical scheme of the application, the left pallet SIFT feature vector and the right pallet SIFT feature vector are therefore fused to obtain the pallet feature vector. The fused pallet feature vector contains richer and more comprehensive information, so the position and attitude of the pallet can be determined more accurately in the subsequent pose estimation.
In an embodiment of the present application, an implementation manner of fusing the left pallet SIFT feature vector and the right pallet SIFT feature vector to obtain the pallet feature vector may be: correlating the left pallet SIFT feature vector and the right pallet SIFT feature vector using a fusion formula to obtain the pallet feature vector; wherein the fusion formula is:
Vc = αVa ⊕ βVb
wherein Vc is the pallet feature vector, Va is the left pallet SIFT feature vector, Vb is the right pallet SIFT feature vector, "⊕" indicates that elements at corresponding positions of the left pallet SIFT feature vector and the right pallet SIFT feature vector are added, and α and β are weighting parameters for controlling the balance between the left pallet SIFT feature vector and the right pallet SIFT feature vector in the pallet feature vector.
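A minimal sketch of the weighted fusion above, assuming the left and right SIFT feature sets have first been reduced to fixed-length vectors of equal dimension (for example by mean-pooling the per-keypoint descriptors); the 130-dimensional size and the values of α and β are assumptions.

```python
import numpy as np

def fuse_pallet_features(v_a, v_b, alpha=0.5, beta=0.5):
    """Vc = alpha*Va (+) beta*Vb: element-wise weighted addition of the left
    and right pallet SIFT feature vectors (equal length is assumed)."""
    return alpha * np.asarray(v_a) + beta * np.asarray(v_b)

# Illustration with placeholder 130-dimensional vectors (e.g. mean-pooled
# per-keypoint SIFT features); the alpha/beta values are assumptions.
v_left = np.random.rand(130).astype(np.float32)
v_right = np.random.rand(130).astype(np.float32)
pallet_feature_vector = fuse_pallet_features(v_left, v_right, alpha=0.6, beta=0.4)
```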
Fig. 4 is a flowchart of obtaining the pose information of the pallet based on the pallet feature vector in the pallet recognition method of the mobile robot according to the embodiment of the present application. As shown in fig. 4, obtaining the pose information of the pallet based on the pallet feature vector includes: S131, performing a sparsity constraint based on a model parameter space on the pallet feature vector to obtain a sparse optimized pallet feature vector; and S132, passing the sparse optimized pallet feature vector through a pallet pose generator to obtain a generation result, where the generation result is the pose information of the pallet.
In particular, in the solution of the present application, it is considered that the image binarization process simplifies the image by converting it into black and white, but this process may lose some important image details and may also amplify image noise. The Canny edge detection algorithm is sensitive to image noise; if noise is present in the image, the Canny algorithm may erroneously identify the noise as an edge, thereby introducing outliers. The SIFT (Scale-Invariant Feature Transform) algorithm may be affected by image noise and outliers when extracting image features, so that the extracted feature vectors contain noise. When the left pallet SIFT feature vector and the right pallet SIFT feature vector are fused, an improper fusion strategy may cause mutual interference and amplify the noise and abnormal values in the feature vectors. If the pallet pose generator (which may be a machine learning model) is insufficiently trained, or the training dataset is small, the model may learn wrong patterns, resulting in overfitting. During image processing and feature extraction, outliers can seriously degrade the quality of the feature vectors if there is no appropriate outlier detection and handling mechanism. Therefore, in order to eliminate the noise and abnormal values in the pallet feature vector, the technical scheme of the application applies a sparsity constraint based on the model parameter space to the pallet feature vector.
Specifically, fig. 5 is a flowchart of a method for identifying a pallet of a mobile robot according to an embodiment of the present application, where sparsity constraint based on a model parameter space is performed on the pallet feature vector to obtain a sparsity-optimized pallet feature vector. As shown in fig. 5, the performing sparsity constraint on the pallet feature vector based on the model parameter space to obtain a sparsity optimization pallet feature vector includes: s1311, extracting a model parameter space, wherein the model parameter space comprises a model weight matrix and a model bias vector; s1312, sparsity constraint based on regularization terms is carried out on the model weight matrix and the model bias vector so as to obtain a sparse model weight matrix and a sparse model bias vector; s1313, processing the pallet feature vector based on the sparse model weight matrix and the sparse model bias vector to obtain the sparse optimized pallet feature vector.
Here, the sparsity constraint based on the model parameter space is applied to the pallet feature vector so that only a few elements of the pallet feature vector are guaranteed to be non-zero, realizing feature selection and optimization. This not only helps simplify the model structure, making the model more transparent and easier to interpret, but also reduces overfitting to the training data and enhances the generalization ability of the model. Furthermore, a sparse pallet feature vector means that the model relies on fewer features when making decisions, which reduces the computational cost and improves the robustness of the model in the face of noise and outliers. When implementing the sparsity constraint based on the model parameter space, the degree of sparsity can be controlled explicitly, so that an appropriate balance is found between model complexity and prediction performance. The method can also assist feature selection, highlighting the features most critical to the task, while protecting sensitive information in the data and thus ensuring privacy.
More specifically, processing the pallet feature vector based on the sparse model weight matrix and the sparse model bias vector to obtain the sparse optimized pallet feature vector is performed with the following formula:
Vsparse = WᵀVc + B
wherein Vc represents the pallet feature vector, W represents the sparse model weight matrix, B represents the sparse model bias vector, ᵀ represents the matrix transpose, and Vsparse represents the sparse optimized pallet feature vector.
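A minimal NumPy sketch of this transformation; the dimensions, and the thresholding used here to mimic the sparsity that the regularization terms would induce in W and B, are assumptions for illustration.

```python
import numpy as np

def sparse_optimize(v_c, weight, bias):
    """Vsparse = W^T * Vc + B, where W and B are the sparse model weight
    matrix and bias vector obtained under a regularization-based constraint."""
    return weight.T @ v_c + bias

# Placeholder dimensions: a 130-d pallet feature vector mapped to 64-d.
rng = np.random.default_rng(0)
W = rng.standard_normal((130, 64))
W[np.abs(W) < 1.0] = 0.0   # mimic the sparsity induced by the L1-style terms
B = np.zeros(64)
v_c = rng.standard_normal(130)

v_sparse = sparse_optimize(v_c, W, B)
```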
More specifically, fig. 6 is a flowchart of processing the pallet feature vector based on the sparse model weight matrix and the sparse model bias vector to obtain the sparse optimized pallet feature vector in the pallet identification method of the mobile robot according to the embodiment of the present application. As shown in fig. 6, processing the pallet feature vector based on the sparse model weight matrix and the sparse model bias vector to obtain the sparse optimized pallet feature vector includes: S13131, creating a controller class, where the controller class is used to handle sparsity constraint requests; S13132, in response to receiving a sparsity constraint request, extracting the sparse model weight matrix, the sparse model bias vector and the pallet feature vector from the sparsity constraint request; S13133, using the controller class, applying the sparsity constraint to the pallet feature vector with the sparse model weight matrix and the sparse model bias vector to obtain the sparse optimized pallet feature vector; and S13134, returning the sparse optimized pallet feature vector.
It should be understood that implementing the sparsity constraint using the Spring Web mechanism has the following benefits: a. Spring Web provides an extensible framework that can easily handle highly concurrent requests; b. Spring Web supports multiple deployment options, including application servers and cloud platforms, providing flexibility; c. Spring Web provides built-in security functions, such as authentication and authorization, to protect the application from unauthorized access; d. Spring Web follows the principles of loose coupling and dependency injection, making the application easy to maintain and extend; e. using the Spring Web mechanism can simplify the deployment and management of the sparsity constraint, because Spring Web provides out-of-the-box functionality such as request processing, model loading and response generation.
In an embodiment of the present application, based on the sparse model weight matrix and the sparse model bias vector, one code implementation manner of processing the pallet feature vector to obtain the sparse optimized pallet feature vector may be:
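A plain-Python sketch of such an implementation, following steps S13131 to S13134; the class name, the request format and the dimensions are illustrative assumptions, and the sketch stands in for the Spring Web controller described above rather than reproducing it.

```python
import numpy as np

class SparsityConstraintController:
    """Controller class handling sparsity constraint requests (S13131-S13134).
    Plain-Python illustration; the application itself envisages deploying this
    behind a Spring Web controller."""

    def handle(self, request: dict) -> np.ndarray:
        # S13132: extract the sparse model weight matrix, sparse model bias
        # vector and pallet feature vector from the request.
        weight = np.asarray(request["sparse_model_weight_matrix"])
        bias = np.asarray(request["sparse_model_bias_vector"])
        v_c = np.asarray(request["pallet_feature_vector"])

        # S13133: apply the sparsity constraint Vsparse = W^T * Vc + B.
        v_sparse = weight.T @ v_c + bias

        # S13134: return the sparse optimized pallet feature vector.
        return v_sparse

# Example request with placeholder dimensions (130-d feature -> 64-d output).
controller = SparsityConstraintController()
request = {
    "sparse_model_weight_matrix": np.zeros((130, 64)),
    "sparse_model_bias_vector": np.zeros(64),
    "pallet_feature_vector": np.ones(130),
}
v_sparse = controller.handle(request)
```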
Finally, the sparse optimized pallet feature vector is passed through the pallet pose generator to obtain a generation result, where the generation result is the pose information of the pallet. Although the sparse optimized pallet feature vector contains rich pallet information, it does not itself directly represent the position and attitude of the pallet in three-dimensional space. To obtain this information, the feature vector needs further processing and parsing by the pallet pose generator. In the application, the pallet pose generator is structured as a generative adversarial network (GAN): a generator network produces the pose information of the pallet, and a discriminator network evaluates and refines the generated result. The generation result provides the position and attitude information of the pallet and helps the mobile robot locate and identify the pallet accurately, thereby enabling automated handling tasks.
In an embodiment of the present application, passing the sparse optimized pallet feature vector through the pallet pose generator to obtain the generation result, where the generation result is the pose information of the pallet, may be implemented as follows: the pallet pose generator comprises a generator network and a discriminator network; the generator network generates the pose information of the pallet, the discriminator network computes the difference between the generated pallet pose information and the real pallet pose information, and the network parameters of the generator network are updated by gradient descent to obtain a pallet pose generator capable of generating pallet pose information.
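A minimal PyTorch sketch of the generator/discriminator arrangement described above; the layer sizes, the six-dimensional pose parameterization and the training-step details are assumptions for illustration and not the specific architecture of the application.

```python
import torch
import torch.nn as nn

FEATURE_DIM, POSE_DIM = 64, 6  # assumed: sparse feature size, (x, y, z, roll, pitch, yaw)

# Generator network: maps a sparse optimized pallet feature vector to pose info.
generator = nn.Sequential(
    nn.Linear(FEATURE_DIM, 128), nn.ReLU(),
    nn.Linear(128, POSE_DIM),
)

# Discriminator network: scores how plausible a (feature, pose) pair is.
discriminator = nn.Sequential(
    nn.Linear(FEATURE_DIM + POSE_DIM, 128), nn.ReLU(),
    nn.Linear(128, 1), nn.Sigmoid(),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
bce = nn.BCELoss()

def train_step(features, real_pose):
    # Discriminator step: real pairs labelled 1, generated pairs labelled 0.
    fake_pose = generator(features).detach()
    d_real = discriminator(torch.cat([features, real_pose], dim=1))
    d_fake = discriminator(torch.cat([features, fake_pose], dim=1))
    d_loss = bce(d_real, torch.ones_like(d_real)) + \
             bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: updated by gradient descent to fool the discriminator.
    gen_pose = generator(features)
    g_score = discriminator(torch.cat([features, gen_pose], dim=1))
    g_loss = bce(g_score, torch.ones_like(g_score))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# At inference time, the generation result is simply generator(v_sparse).
```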
In summary, the pallet identification method of the mobile robot according to the embodiment of the application has been described. It adopts an image processing technique based on deep learning: a front image of the pallet is first acquired by a binocular camera, feature extraction is then performed on the acquired pallet images to obtain pallet-related features, sparse optimization is applied to those features, and finally the sparse-optimized features are input into a generator to obtain the pose information of the pallet. In this way, automatic recognition of the pose information of the pallet is realized and the accuracy of pose recognition is improved.
Fig. 7 is a system block diagram of a pallet identification system of a mobile robot according to an embodiment of the present application. As shown in fig. 7, a pallet recognition system 100 of a mobile robot according to an embodiment of the present application includes: the pallet related data acquisition module 110 is configured to acquire a pallet front image captured by the left camera and a pallet front image captured by the right camera; the pallet related data feature encoding module 120 is configured to perform feature extraction on the pallet front image captured by the left camera and the pallet front image captured by the right camera to obtain a pallet feature vector; and the pallet pose information result generating module 130 is configured to obtain pose information of the pallet based on the pallet feature vector.
Here, it will be understood by those skilled in the art that the specific functions and operations of the respective units and modules in the pallet recognition system 100 of the mobile robot described above have been described in detail in the above description of the pallet recognition method of the mobile robot with reference to fig. 1 to 6, and thus, repetitive descriptions thereof will be omitted.
In summary, the pallet recognition system 100 of the mobile robot according to the embodiment of the application has been described. It adopts an image processing technique based on deep learning: a front image of the pallet is first acquired by a binocular camera, feature extraction is then performed on the acquired pallet images to obtain pallet-related features, sparse optimization is applied to those features, and finally the sparse-optimized features are input into a generator to obtain the pose information of the pallet. In this way, automatic recognition of the pose information of the pallet is realized and the accuracy of pose recognition is improved.
As described above, the pallet recognition system 100 of the mobile robot according to the embodiment of the present application may be implemented in various wireless terminals, such as a server for pallet recognition of the mobile robot, and the like. In one example, the pallet identification system 100 of a mobile robot according to an embodiment of the present application may be integrated into a wireless terminal as one software module and/or hardware module. For example, the pallet identification system 100 of the mobile robot may be a software module in the operating system of the wireless terminal, or may be an application developed for the wireless terminal; of course, the pallet identification system 100 of the mobile robot may also be one of a number of hardware modules of the wireless terminal.
Alternatively, in another example, the pallet identification system 100 of the mobile robot and the wireless terminal may be separate devices, and the pallet identification system 100 of the mobile robot may be connected to the wireless terminal through a wired and/or wireless network and transmit interactive information in an agreed data format.
Finally, the application also relates to a computer-readable storage medium for implementing a pallet identification method of a mobile robot according to one or more embodiments of the application. References herein to computer-readable storage media include various types of computer storage media, and can be any available media that can be accessed by a general-purpose or special-purpose computer. By way of example, a computer-readable storage medium may comprise RAM, ROM, EPROM, EEPROM, registers, a hard disk, a removable disk, a CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other temporary or non-temporary medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. For a description of the computer-readable storage medium according to the present application, reference can be made to the explanation of the pallet recognition method for the mobile robot above, which will not be repeated.
In the several embodiments provided by the present application, it should be understood that the disclosed method, system, or computer readable storage medium may be implemented in other ways. For example, the system embodiments described above are merely illustrative, e.g., the division of the modules is merely a logical function division, and other manners of division may be implemented in practice.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in a form of hardware or a form of hardware and a form of software functional modules.
It will be evident to those skilled in the art that the application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means recited in the system claims can also be implemented by one unit or means through software or hardware. Terms such as first, second, etc. are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present application and not for limiting the same, and although the present application has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present application without departing from the spirit of the technical solution of the present application.
Claims (10)
1. A pallet identification method for a mobile robot, comprising:
Acquiring a pallet front image shot by a left camera and a pallet front image shot by a right camera;
Performing feature extraction on the pallet front image shot by the left camera and the pallet front image shot by the right camera to obtain pallet feature vectors;
And obtaining the pose information of the pallet based on the pallet feature vector.
2. The pallet identification method of claim 1, wherein the feature extraction of the pallet front image captured by the left camera and the pallet front image captured by the right camera to obtain pallet feature vectors comprises:
Performing preliminary processing on the pallet front image shot by the left camera and the pallet front image shot by the right camera to obtain a left binarized pallet image and a right binarized pallet image;
performing feature extraction on the left binarized pallet image and the right binarized pallet image to obtain a left pallet SIFT feature vector and a right pallet SIFT feature vector;
and fusing the left pallet SIFT feature vector and the right pallet SIFT feature vector to obtain the pallet feature vector.
3. The pallet identification method of claim 2, wherein performing preliminary processing on the pallet front image captured by the left camera and the pallet front image captured by the right camera to obtain a left binarized pallet image and a right binarized pallet image comprises: and performing image binarization processing on the pallet front image shot by the left camera and the pallet front image shot by the right camera to obtain a left binarized pallet image and a right binarized pallet image.
4. The pallet identification method of claim 3, wherein performing feature extraction on the left and right binarized pallet images to obtain left and right pallet SIFT feature vectors comprises:
Respectively extracting Canny edge detection images of the left binarization pallet image and the right binarization pallet image to obtain a left pallet Canny edge detection image and a right Canny edge detection image;
And respectively carrying out SIFT algorithm-based operation on the left pallet Canny edge detection diagram and the right Canny edge detection diagram to obtain the left pallet SIFT feature vector and the right pallet SIFT feature vector.
5. The pallet identification method of the mobile robot according to claim 4, wherein the obtaining the pose information of the pallet based on the pallet feature vector comprises:
carrying out sparsity constraint on the pallet feature vector based on a model parameter space to obtain a sparse optimization pallet feature vector;
And the sparse optimization pallet feature vector is passed through a pallet pose generator to obtain a generation result, wherein the generation result is pose information of the pallet.
6. The pallet identification method of claim 5, wherein performing a model parameter space based sparsity constraint on the pallet feature vector to obtain a sparsity optimized pallet feature vector comprises:
extracting a model parameter space, wherein the model parameter space comprises a model weight matrix and a model bias vector;
sparsity constraint based on regularization terms is carried out on the model weight matrix and the model bias vector so as to obtain a sparse model weight matrix and a sparse model bias vector;
and processing the pallet feature vector based on the sparse model weight matrix and the sparse model bias vector to obtain the sparse optimization pallet feature vector.
7. The pallet identification method of claim 6, wherein processing the pallet feature vector based on the sparse model weight matrix and the sparse model bias vector to obtain the sparse optimized pallet feature vector is used to: process the pallet feature vector with the following formula to obtain the sparse optimized pallet feature vector;
Wherein the formula is:
Vsparse = WᵀVc + B
wherein Vc represents the pallet feature vector, W represents the sparse model weight matrix, B represents the sparse model bias vector, ᵀ represents the matrix transpose, and Vsparse represents the sparse optimization pallet feature vector.
8. The mobile robot pallet identification method of claim 7, wherein processing the pallet feature vector based on the sparse model weight matrix and the sparse model bias vector to obtain the sparse optimized pallet feature vector comprises:
Creating a controller class, wherein the controller class is used for processing sparsity constraint requests;
In response to receiving the sparsity constraint request, extracting the sparse model weight matrix, the sparse model bias vector, and the pallet feature vector from the sparsity constraint request;
Using the controller class, using the sparse model weight matrix and the sparse model bias vector, and performing sparse constraint on the pallet feature vector to obtain the sparse optimization pallet feature vector;
and returning the sparse optimization pallet characteristic vector.
9. A pallet identification system for a mobile robot, comprising:
The pallet related data acquisition module is used for acquiring a pallet front image shot by the left camera and a pallet front image shot by the right camera;
The pallet related data feature coding module is used for carrying out feature extraction on the pallet front image shot by the left camera and the pallet front image shot by the right camera to obtain pallet feature vectors;
And the pallet pose information result generation module is used for obtaining the pose information of the pallet based on the pallet feature vector.
10. The mobile robotic pallet identification system of claim 9, wherein the pallet-related data feature encoding module comprises:
the pallet image preliminary processing unit is used for carrying out preliminary processing on the pallet front image shot by the left camera and the pallet front image shot by the right camera to obtain a left binarized pallet image and a right binarized pallet image;
The binarization pallet image feature extraction unit is used for carrying out feature extraction on the left binarization pallet image and the right binarization pallet image to obtain a left pallet SIFT feature vector and a right pallet SIFT feature vector;
And the feature vector fusion unit is used for fusing the left pallet SIFT feature vector and the right pallet SIFT feature vector to obtain the pallet feature vector.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410596710.9A CN118397603A (en) | 2024-05-14 | 2024-05-14 | Pallet recognition system and pallet recognition method for mobile robot |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410596710.9A CN118397603A (en) | 2024-05-14 | 2024-05-14 | Pallet recognition system and pallet recognition method for mobile robot |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118397603A true CN118397603A (en) | 2024-07-26 |
Family
ID=91984763
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410596710.9A Pending CN118397603A (en) | 2024-05-14 | 2024-05-14 | Pallet recognition system and pallet recognition method for mobile robot |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118397603A (en) |
- 2024-05-14: CN application CN202410596710.9A filed; publication CN118397603A, status: Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11443454B2 (en) | Method for estimating the pose of a camera in the frame of reference of a three-dimensional scene, device, augmented reality system and computer program therefor | |
US11205276B2 (en) | Object tracking method, object tracking device, electronic device and storage medium | |
CN108960211B (en) | Multi-target human body posture detection method and system | |
Szentandrási et al. | Fast detection and recognition of QR codes in high-resolution images | |
CN110853033B (en) | Video detection method and device based on inter-frame similarity | |
KR101988384B1 (en) | Image matching apparatus, image matching system and image matching mehod | |
JP5261501B2 (en) | Permanent visual scene and object recognition | |
US9349194B2 (en) | Method for superpixel life cycle management | |
CN111144366A (en) | Strange face clustering method based on joint face quality assessment | |
US9767383B2 (en) | Method and apparatus for detecting incorrect associations between keypoints of a first image and keypoints of a second image | |
US20240020923A1 (en) | Positioning method based on semantic information, device and computer-readable storage medium | |
KR102226845B1 (en) | System and method for object recognition using local binarization | |
CN115375917B (en) | Target edge feature extraction method, device, terminal and storage medium | |
CN109636828A (en) | Object tracking methods and device based on video image | |
CN112907667A (en) | Visual laser fusion tray pose estimation method, system and device | |
Hadfield et al. | Hollywood 3d: what are the best 3d features for action recognition? | |
CN112802112B (en) | Visual positioning method, device, server and storage medium | |
CN112396654B (en) | Method and device for determining pose of tracked object in image tracking process | |
CN109785367B (en) | Method and device for filtering foreign points in three-dimensional model tracking | |
CN118397603A (en) | Pallet recognition system and pallet recognition method for mobile robot | |
KR102540290B1 (en) | Apparatus and Method for Person Re-Identification based on Heterogeneous Sensor Camera | |
CN115294358A (en) | Feature point extraction method and device, computer equipment and readable storage medium | |
KR102224218B1 (en) | Method and Apparatus for Deep Learning based Object Detection utilizing Video Time Information | |
KR20230060029A (en) | Planar surface detection apparatus and method | |
CN113674319A (en) | Target tracking method, system, equipment and computer storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |