CN113763470B - RGBD visual inertia simultaneous positioning and map construction with point-line feature fusion - Google Patents
RGBD visual inertia simultaneous positioning and map construction with point-line feature fusion
- Publication number: CN113763470B (application CN202110914560.8A)
- Authority
- CN
- China
- Prior art keywords
- line
- point
- line segment
- segments
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T7/74—Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06T17/05—Geographic models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/248—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/269—Analysis of motion using gradient-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
Abstract
The invention discloses a method, an apparatus, a device and a medium for RGBD visual-inertial simultaneous localization and mapping based on point-line feature fusion. The method, apparatus, device and medium address efficient state estimation and high-precision three-dimensional map construction for autonomous robots in indoor scenes with illumination changes and low texture, and offer high accuracy, high efficiency and other advantages.
Description
Technical Field
The invention relates to a system and a method for RGBD visual-inertial simultaneous localization and mapping based on point-line feature fusion, and belongs to the technical field of simultaneous localization and mapping for autonomous robots.
Background
One of the goals of robotics is autonomous operation in the real world, and simultaneous localization and mapping (SLAM) is a key technology for autonomous robots.
Efficient state estimation and accurate three-dimensional map construction for autonomous robots in indoor scenes with illumination changes and weak texture face significant challenges. On the one hand, in such scenes, visual-inertial SLAM methods based on the feature point method suffer greatly reduced localization accuracy because a large number of effective feature points are difficult to extract, and the system may even fail completely due to poor texture, video blur, motion blur and the like.
Most traditional line feature detection pipelines adopt the EDLines algorithm as the line feature detection tool, but when facing complex backgrounds or noisy images this algorithm easily detects too many short line segment features, which wastes computational resources on line segment detection, description and matching and noticeably reduces localization accuracy. In addition, the algorithm often produces too many adjacent similar line segments and tends to split long line segments into several short ones, which complicates the subsequent line segment matching task and increases the uncertainty of the SLAM system.
Three-dimensional dense point cloud map construction is highly dependent on the accuracy of the pose estimation algorithm; if the pose estimation is not accurate enough, the constructed dense point cloud map is prone to overlapping and distortion.
For these reasons, the inventors conducted intensive research on existing methods for autonomous robot state estimation and three-dimensional dense point cloud map construction, and propose an RGBD visual-inertial simultaneous localization and mapping system and method based on point-line feature fusion.
Disclosure of Invention
To overcome these problems, the inventors conducted intensive research and designed an RGBD visual-inertial simultaneous localization and mapping method based on point-line feature fusion, which comprises a front-end visual odometry process, a back-end optimization process and a three-dimensional mapping process. Further, in the front-end visual odometry process, the input of the visual-inertial simultaneous localization and mapping system is the images acquired by an RGBD camera and the measurements of an IMU.
Further, the front-end visual odometry is based on point-line features and includes feature detection and matching, IMU pre-integration and visual-inertial alignment.
The feature detection and matching comprises the extraction and tracking of point features and the extraction and tracking of line features.
Further, the extraction and tracking of line features comprises the sub-steps of: S11, line segment length suppression; S12, line feature extraction through near-line merging and short-line splicing; S13, line feature tracking by an optical flow method. Preferably, step S13 comprises the sub-steps of: S131, matching anchor points; S132, matching points with lines; S133, matching lines. Preferably, the front-end visual odometry process further comprises: S14, parameterizing the point and line features, with error constraints applied to the line features during parameterization.
Preferably, the IMU pre-integration value is taken as a constraint between two consecutive camera frame images.
Preferably, in the back-end optimization process, all state variables in the sliding window are optimized by minimizing the sum of all measurement residuals.
The invention also provides an RGBD visual-inertial simultaneous localization and mapping apparatus based on point-line feature fusion, which comprises a front-end visual odometry module, a back-end optimization module and a three-dimensional mapping module.
The front-end visual-inertial odometry module comprises a feature detection and matching sub-module, an IMU pre-integration sub-module and a visual-inertial alignment sub-module. The IMU pre-integration sub-module uses the IMU pre-integration value as a constraint between two frames, and the back-end optimization module optimizes all state variables in a sliding window by minimizing the sum of all measurement residuals.
The invention also provides an electronic device, comprising: at least one processor; and
A memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform any one of the methods described above.
The invention also provides a computer readable storage medium storing computer instructions for causing the computer to perform any one of the methods described above.
The invention has the beneficial effects that:
(1) The RGBD visual-inertial simultaneous localization and mapping system based on point-line feature fusion provided by the invention introduces line features to address the large data-association error of a point-feature-only visual-inertial odometry front end in weak-texture scenes, effectively improving the accuracy of inter-frame motion estimation in such scenes. (2) By using the length suppression, near-line merging and short-line splicing strategies, the low quality of line segment extraction in traditional algorithms is improved while fast extraction is preserved, reducing the line feature mismatching rate of the system. (3) Tracking line features with an optical flow method significantly improves the efficiency of the line feature tracking algorithm. (4) By constructing a high-precision dense point cloud map, efficient state estimation and accurate three-dimensional point cloud construction can be achieved in indoor scenes with illumination changes and weak texture.
Drawings
FIG. 1 is a flow chart of an RGBD visual-inertial simultaneous localization and mapping method according to a preferred embodiment of the present invention;
FIG. 2 shows a comparison of line segment detection before and after line segment length suppression in the method according to a preferred embodiment of the present invention;
FIG. 3 is a schematic diagram of merging several close line segments into one line segment in the method according to a preferred embodiment of the present invention;
FIG. 4 is a schematic diagram of splicing several shorter line segments into one long line segment in the method according to a preferred embodiment of the present invention;
FIG. 5 is a schematic diagram of matching anchor points between the I_0 and I_1 frame images in the method according to a preferred embodiment of the present invention;
FIG. 6 is a schematic diagram of the point-to-line matching process in the method according to a preferred embodiment of the present invention;
FIG. 7 is a schematic diagram of the line-to-line matching process in the method according to a preferred embodiment of the present invention;
FIG. 8 is a schematic diagram of the line feature re-projection error in the method according to a preferred embodiment of the present invention;
FIG. 9-a shows the cafe line feature detection results in Example 1; FIG. 9-b shows the cafe point feature detection results in Example 1; FIG. 9-c shows the cafe line detection and tracking results in Example 1;
FIG. 10-a shows the home 1 line feature detection results in Example 1; FIG. 10-b shows the home 1 point feature detection results in Example 1; FIG. 10-c shows the home 1 line detection and tracking results in Example 1;
FIG. 11-a shows the home 2 line feature detection results in Example 1; FIG. 11-b shows the home 2 point feature detection results in Example 1; FIG. 11-c shows the home 2 line detection and tracking results in Example 1;
FIG. 12-a shows the corridor line feature detection results in Example 1; FIG. 12-b shows the corridor point feature detection results in Example 1; FIG. 12-c shows the corridor line detection and tracking results in Example 1;
FIG. 13-a shows the comparison of the results in the corridor scene in Example 1 with the ground-truth trajectory and the error; FIG. 13-b shows the same comparison in top view; FIG. 13-c shows the same comparison in side view;
FIG. 14 shows the results of Example 2 and Comparative Examples 3 to 5 in scene Lab14; FIG. 15 shows the translational error box plot on scene Lab1; FIG. 16 shows the rotational error box plot on scene Lab1;
FIG. 17 shows the position drift of Example 2 in scene Lab3; FIG. 18 shows the position drift of Example 2 in scene Lab6; FIGS. 19 to 21 show the three-dimensional maps obtained in Comparative Example 3 and Example 2.
Detailed Description
The invention is further described in detail below by means of the figures and examples. The features and advantages of the present invention will become more apparent from the description.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
Most conventional visual-inertial simultaneous localization and mapping (SLAM) methods are based on point feature detection and description; however, in indoor scenes with illumination changes and weak texture, their localization accuracy is greatly reduced because a large number of effective feature points are difficult to extract, and the system may even fail completely under poor texture or motion blur.
In view of this, the inventors introduce line features and inertial measurement unit (IMU) data into the traditional SLAM method. Under low-texture conditions, line feature detection is more robust than point feature detection, and the inertial measurements remain available at a good rate even when few features can be found in the image sequence, so the SLAM result of the robot remains stable even when the viewing angle changes significantly.
The RGBD visual-inertial simultaneous localization and mapping method based on point-line feature fusion provided by the invention comprises a front-end visual odometry process, a back-end optimization process and a three-dimensional mapping process.
In the front-end visual odometry process, the input of the visual-inertial simultaneous localization and mapping system is the images acquired by an RGBD camera and the measurements of the IMU.
The RGBD camera is a camera capable of simultaneously acquiring and outputting RGB information and depth information and is one of the cameras commonly used for image recognition; the IMU is a sensing unit commonly used on robots for measuring the acceleration and angular velocity of the robot's motion.
Further, the front-end visual odometry is based on point and line features and includes feature detection and matching, IMU pre-integration and visual-inertial alignment, as shown in FIG. 1.
The feature detection and matching comprises the extraction and tracking of point features and the extraction and tracking of line features.
In the invention, the method for extracting and tracking point features is not particularly limited, and the point feature extraction and tracking methods of traditional SLAM may be adopted: for example, Shi-Tomasi corners are extracted as feature points, the KLT optical flow method is used to track them, and points with large tracking discrepancy are eliminated based on the reverse optical flow method.
The Shi-Tomasi corner extraction follows J. Shi and C. Tomasi, "Good features to track," Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 1994, pp. 593-600; the KLT optical flow method follows B. D. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," Proceedings of the 7th International Joint Conference on Artificial Intelligence (IJCAI), IEEE, 1981, pp. 24-28; the reverse optical flow method follows S. Baker and I. Matthews, "Lucas-Kanade 20 years on: A unifying framework," International Journal of Computer Vision, 2004, 56(3):221-255.
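As a concrete illustration of this conventional point-feature front end, the following Python sketch uses OpenCV's Shi-Tomasi detector and pyramidal KLT tracker with a forward-backward (reverse optical flow) check; the function names, parameter values and the round-trip threshold are illustrative assumptions and not values taken from the patent.

```python
import cv2
import numpy as np

def detect_points(gray, max_corners=150):
    # Shi-Tomasi corner detection (quality level and minimum distance are illustrative)
    return cv2.goodFeaturesToTrack(gray, maxCorners=max_corners,
                                   qualityLevel=0.01, minDistance=20)

def track_points(prev_gray, cur_gray, prev_pts, fb_thresh=1.0):
    """Track corners with KLT optical flow and reject points whose
    forward-backward (reverse optical flow) error is too large."""
    # Forward tracking: previous frame -> current frame
    cur_pts, st_fwd, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, prev_pts, None)
    # Reverse tracking: current frame -> previous frame
    back_pts, st_bwd, _ = cv2.calcOpticalFlowPyrLK(cur_gray, prev_gray, cur_pts, None)
    # Keep points whose round-trip error is below fb_thresh pixels
    fb_err = np.linalg.norm(prev_pts - back_pts, axis=2).ravel()
    good = (st_fwd.ravel() == 1) & (st_bwd.ravel() == 1) & (fb_err < fb_thresh)
    return prev_pts[good], cur_pts[good]
```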
In the invention, compared with the traditional SLAM system, line feature extraction and tracking is added to the front-end visual odometry; the additional line features provide extra constraints for the system and thereby improve the accuracy of state estimation.
In the prior art, line feature extraction methods include the LSD, FLD and EDLines methods. The detection speed of LSD and FLD is low and cannot meet the real-time requirement of SLAM; EDLines is faster and more efficient, but when facing a low-texture scene or a noisy image it tends to detect a large number of short line features, which wastes computational resources on line detection, description and matching and noticeably reduces localization accuracy.
In a SLAM system fusing point and line features, the line features are only used to provide additional constraints to improve the accuracy of state estimation. The invention therefore improves the EDLines method so that only salient line features in the scene are detected while the original detection efficiency is maintained, yielding a more effective line feature detection result.
Specifically, in the present invention, the extraction and tracking of line features includes the sub-steps of:
S11, suppressing the length of the line segment; s12, extracting line characteristics through near line combination and short line splicing; s13, tracking line features through an optical flow method; preferably, after step S12, there may also be step S14, point, line feature parameterization.
Because longer line segments are more stable and can be repeatedly detected across multiple frames, localization accuracy increases with the number of long line segments. In step S11, line segments whose length is below a threshold are deleted through line segment length suppression, which effectively improves the quality of the extracted line segments. A comparison of the detection results before and after line segment length suppression is shown in FIG. 2.
Specifically, the line segment length suppression removes every line segment whose length len_i = sqrt((x_ep - x_sp)^2 + (y_ep - y_sp)^2) is smaller than the threshold len_min = ⌈η · min(W_I, H_I)⌉,
where len_i denotes the length of the i-th line segment in the image, len_min denotes the threshold on the shortest line feature length, W_I is the width of the image, H_I is the height of the image, ⌈·⌉ denotes rounding up, η is a scaling factor, (x_sp, y_sp) are the start-point coordinates of the line segment, and (x_ep, y_ep) are the end-point coordinates of the line segment.
In the invention, the minimum of the image width and the image height is used as one of the key factors for setting the threshold, so that the line segment length suppression adapts to different image sizes.
Preferably, the scaling factor η is 0.01 to 0.2, more preferably 0.1; through extensive experiments, the inventors found that this scaling factor best preserves the quality of line segment detection.
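A minimal Python sketch of this length-suppression step, assuming line segments are stored as endpoint quadruples; the array layout and the helper name are assumptions:

```python
import numpy as np

def suppress_short_segments(segments, img_w, img_h, eta=0.1):
    """Drop line segments shorter than len_min = ceil(eta * min(W, H)).
    `segments` is an array of shape (N, 4): (x_sp, y_sp, x_ep, y_ep)."""
    len_min = int(np.ceil(eta * min(img_w, img_h)))
    lengths = np.hypot(segments[:, 2] - segments[:, 0],
                       segments[:, 3] - segments[:, 1])
    return segments[lengths >= len_min]
```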
Considering that longer line segment features tend to come from continuous regions with larger gradients, where the detection accuracy is more reliable, adjacent line segments need to be merged so that the resulting line segments are longer.
In step S12, several close line segments are combined into one line segment, i.e. near-line merging, as shown in FIG. 3, and
several shorter line segments are spliced into one long line segment, i.e. short-line splicing, as shown in FIG. 4.
As shown in FIG. 3, L_i, L_j and L_k denote the line segments to be merged, each with its respective pair of endpoints.
As shown in FIG. 4, L_u and L_v denote the two line segments to be spliced, each with its respective pair of endpoints.
Preferably, step S12 comprises the following sub-steps for near-line merging and short-line splicing:
S121, sorting the line segments; S122, screening the line segments; S123, filtering the screened line segment groups to obtain a merging candidate line segment set and a splicing candidate line segment set; S124, merging or splicing the line segments.
In step S121, the line segments are sorted by length, preferably in descending order from long to short, to obtain a line segment sequence {L_1, L_2, ..., L_n}, where L_1 denotes the longest line segment.
Since longer line segments tend to come from image regions with continuously strong gradients, it is more reliable to start from the longest line segment L_1.
Further, the line segments other than L_1 form the remaining line segment set
L = {L_2, L_3, ..., L_n}   (Formula 2)
where n is the total number of line segments.
In step S122, the screening includes angle screening: from the remaining line feature set, the line segments whose angle with respect to the horizontal direction differs from that of L_1 by less than the threshold θ_min are collected into the candidate line segment group L_α, where L_α denotes the candidate line segment group obtained by angle screening, L_m denotes a line segment in the remaining line feature set, θ_m is the angle of line segment L_m with respect to the horizontal direction, and θ_min is the angle threshold used to measure line segment similarity.
Through extensive experiments, the inventors determined that θ_min = π/90 is optimal.
In step S123, the obtained candidate line segment group L_α is filtered by spatial proximity to L_1: the filtering of the near-line merging process yields the merging candidate line segment set, and the filtering of the short-line splicing process yields the splicing candidate line segment set, in both cases retaining only candidates that lie within the proximity threshold of L_1.
In Formulas (4) and (5), d_min denotes the threshold used to measure line segment proximity; through extensive experimental analysis, it is preferably 5 to 15 pixels, more preferably 9 pixels.
Through this filtering process, line segments whose angle and spatial distance are close to those of L_1 are collected into one group, giving the candidate line segment set L_β = {L_β1, L_β2, ...}.
In step S124, line segment L_1 is added to the filtered candidate line segment set L_β to form a new line segment set {L_1, L_β}; the two endpoints in this set that are farthest from each other are selected as the start point and end point of a new line segment, synthesizing a new line segment L_M.
The angle θ_M of the resulting line segment L_M is then recalculated.
If the angle difference between θ_M and the original angle of L_1 is smaller than θ_min, the line segment set {L_1, L_β} is merged or spliced and replaced by the synthesized line segment L_M;
otherwise the merging or splicing is abandoned, because the angle difference before and after the combination is too large and the synthesized result would deviate from the original line segment.
This process is repeated until all line segments in the image have been merged or spliced.
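The following Python sketch illustrates one possible greedy implementation of steps S121-S124. It simplifies the filtering of step S123 to a single endpoint-distance test, ignores angle wrap-around near π, and uses illustrative helper names, so it is a sketch of the idea rather than the patented procedure.

```python
import numpy as np

def seg_angle(s):
    # orientation of segment (x_sp, y_sp, x_ep, y_ep) w.r.t. the horizontal axis
    return np.arctan2(s[3] - s[1], s[2] - s[0]) % np.pi

def endpoint_distance(a, b):
    # smallest distance between any endpoint of a and any endpoint of b
    pa = np.array([a[:2], a[2:]]); pb = np.array([b[:2], b[2:]])
    return min(np.linalg.norm(p - q) for p in pa for q in pb)

def merge_and_splice(segments, theta_min=np.pi / 90, d_min=9.0):
    """Greedy near-line merging / short-line splicing (S121-S124).
    segments: list of (x_sp, y_sp, x_ep, y_ep) tuples."""
    # S121: sort by length, longest first
    segs = sorted(segments, key=lambda s: np.hypot(s[2]-s[0], s[3]-s[1]), reverse=True)
    result, used = [], [False] * len(segs)
    for i, L1 in enumerate(segs):
        if used[i]:
            continue
        group = [L1]
        for j in range(i + 1, len(segs)):
            if used[j]:
                continue
            Lm = segs[j]
            # S122 angle screening and S123 proximity filtering (simplified)
            if abs(seg_angle(Lm) - seg_angle(L1)) < theta_min and \
               endpoint_distance(L1, Lm) < d_min:
                group.append(Lm); used[j] = True
        # S124: synthesize L_M from the two farthest endpoints of the group
        pts = np.array([p for s in group for p in (s[:2], s[2:])])
        d = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
        a, b = np.unravel_index(np.argmax(d), d.shape)
        LM = (*pts[a], *pts[b])
        # accept only if the synthesized angle stays close to L1's angle
        if abs(seg_angle(LM) - seg_angle(L1)) < theta_min:
            result.append(LM)
        else:
            result.append(tuple(L1))
        used[i] = True
    return result
```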
Step S12 yields a set of line segments in each frame; determining the correspondence between line segments in two consecutive frame images, so as to track the line segments, is one of the problems the invention solves.
In step S13, the points on the line segment are matched by an optical flow method, and then the matched points are counted to realize the tracking of the line segment.
Optical flow methods are commonly used for tracking point features, and in the present invention, they are modified to enable tracking of line segments.
Specifically, the method comprises the following substeps: s131, matching anchor points; s132, matching points with lines; s133, matching the lines.
In step S131, the anchor points are points sampled at intervals along a line segment, so that one line segment is characterized by several anchor points.
In the present invention, two consecutive frame images are denoted I_0 and I_1. The line segment set in the I_0 frame image is denoted
L_0 = { 0l_i | 0l_i = (0p_i, 0q_i), 0 ≤ i < M_0 }, where the left prefix denotes the frame, M_0 is the total number of line segments in the I_0 frame image, 0l_i is the i-th line segment in the I_0 frame image, and 0p_i, 0q_i are its two endpoints;
likewise, the line segment set in the I_1 frame image is denoted
L_1 = { 1l_k | 1l_k = (1p_k, 1q_k), 0 ≤ k < M_1 }, where M_1 is the total number of line segments in the I_1 frame image, 1l_k is the k-th line segment in the I_1 frame image, and 1p_k, 1q_k are its two endpoints.
For each line segment 0l_i = (0p_i, 0q_i) in the I_0 frame image, the start point, the end point and several points along the direction from the start point to the end point are selected as anchor points, as shown in FIG. 5; the anchor point set is 0Q_i = { 0λ_{i,j} | 0 ≤ j < N_i }, where 0λ_{i,j} is the j-th anchor point on the i-th line segment and N_i is the total number of anchor points on that segment, preferably 12.
Further, the unit vector along the direction from the start point to the end point is
0a_i = [0a_{i,0}, 0a_{i,1}]^T = (0q_i − 0p_i) / ||0q_i − 0p_i||.
Further, in the I_1 frame image, the matching point 1λ_{i,j} of each anchor point 0λ_{i,j} of the I_0 frame image can be obtained by searching along the normal direction 0n_i, where the normal direction is
0n_i = [−0a_{i,1}, 0a_{i,0}]^T.
In step S132, in the I_1 frame, the shortest distance from each matching point 1λ_{i,j} to each line segment 1l_k is computed, giving a distance set; the smallest distance in this set is denoted d_min and the corresponding line segment is denoted 1l_closest(i,j), as shown in FIG. 6.
Further, if d_min < d_threshold, the matching point 1λ_{i,j} is considered to belong to line segment 1l_closest(i,j);
otherwise, the matching point 1λ_{i,j} is not considered to belong to line segment 1l_closest(i,j), and the next anchor point is examined;
where d_threshold is an empirical threshold, preferably 1 pixel.
In step S133, as shown in FIG. 7, each anchor point of any line segment 0l_i (0 ≤ i < M_0) in the I_0 frame image has at most one matching point 1λ_{i,j} in the I_1 frame image falling on some line segment 1l_closest(i,j). For line segment 0l_i in the I_0 frame, the number of its anchor points matched in the I_1 frame is counted; let Z_i be the number of anchor points matched to line segment 1l_match(i). If Z_i / N_i > R, then 0l_i and 1l_match(i) (0 ≤ match(i) < M_1) are defined to match each other, where R is an empirical threshold, preferably 0.6.
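The anchor-point line tracking of steps S131-S133 can be sketched as follows with OpenCV's KLT tracker; the uniform anchor sampling, the point-to-segment distance helper and the guard for an empty segment set are implementation assumptions rather than details taken from the patent.

```python
import numpy as np
import cv2

def point_to_segment(p, seg):
    # distance from point p to the finite segment seg = (x_sp, y_sp, x_ep, y_ep)
    a, b = np.array(seg[:2], float), np.array(seg[2:], float)
    ab, ap = b - a, np.asarray(p, float) - a
    t = np.clip(np.dot(ap, ab) / max(np.dot(ab, ab), 1e-9), 0.0, 1.0)
    return float(np.linalg.norm(ap - t * ab))

def track_lines(img0, img1, lines0, lines1, n_anchors=12,
                d_threshold=1.0, ratio_R=0.6):
    """Match line segments between frames I0 and I1 by tracking anchor
    points with KLT optical flow and voting for the nearest I1 segment."""
    matches = {}
    if len(lines1) == 0:
        return matches
    for i, (xs, ys, xe, ye) in enumerate(lines0):
        # S131: sample anchor points along the segment
        t = np.linspace(0.0, 1.0, n_anchors)
        anchors = np.stack([xs + t * (xe - xs), ys + t * (ye - ys)], axis=1)
        anchors = anchors.astype(np.float32).reshape(-1, 1, 2)
        tracked, status, _ = cv2.calcOpticalFlowPyrLK(img0, img1, anchors, None)
        votes = np.zeros(len(lines1), dtype=int)
        for pt, ok in zip(tracked.reshape(-1, 2), status.ravel()):
            if not ok:
                continue
            # S132: assign the tracked anchor to its closest I1 segment
            dists = [point_to_segment(pt, l) for l in lines1]
            k = int(np.argmin(dists))
            if dists[k] < d_threshold:
                votes[k] += 1
        # S133: accept the pairing if enough anchors agree
        if votes.max() > ratio_R * n_anchors:
            matches[i] = int(np.argmax(votes))
    return matches
```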
The inventors have found that, although a line-feature-based solution is a good choice in low-texture scenes lacking point features, line feature occlusion or false detection often occurs in repetitive or low-texture scenes. In the present invention, this problem is addressed by the line feature error constraint.
In step S14, the point and line features are parameterized, and during the parameterization error constraints are applied to the line features.
Specifically, in the present invention, the errors of the point features and the line features are constrained as follows (see FIG. 8):
the point projection error term measures the difference between the observation x_{i,j} on the image plane and the projection of X_{w,j}, the j-th point observed in frame i, where T_{iw} denotes the state (pose) of the RGBD camera of frame i in the world coordinate system;
the line projection error term is formed from P_{w,k} and Q_{w,k}, the two endpoints of the k-th line segment observed in frame i, whose corresponding points on the image plane are p_{i,k} and q_{i,k}: the error consists of the distance from p_{i,k} to the projected line segment and the distance from q_{i,k} to the projected line segment, the projected line segment being defined by the projections of P_{w,k} and Q_{w,k} on the image plane.
Here π(·) is the camera projection model and π_h(·) its homogeneous-coordinate form; a point O in the RGBD camera coordinate system is obtained from a world point P_w through the coordinate rotation R_w and translation t_w, and the projection uses the focal lengths f_x, f_y of the RGBD camera and its principal point (c_x, c_y).
Further, the camera-coordinate points O_p and O_q corresponding to P_{w,k} and Q_{w,k} are obtained in the same way.
The Jacobian matrix of the re-projection error of the projected line segment with respect to the endpoints P_{w,k} and Q_{w,k} is computed from the components of O_p and of O_q along the x, y and z axes of the camera coordinate system,
where l_p is the vector from the camera optical center towards the projected endpoint of P_{w,k} and l_q is the vector from the camera optical center towards the projected endpoint of Q_{w,k}.
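For concreteness, a commonly used form of the point and line re-projection residuals consistent with the description above is given below; this is a reconstruction of the standard point-line formulation, not the patent's own formula images.

```latex
% Point re-projection residual for point j observed in frame i
r_{p}(i,j) = x_{i,j} - \pi\!\left(R_{w} X_{w,j} + t_{w}\right),
\qquad
\pi\!\left([X, Y, Z]^{T}\right) =
\begin{bmatrix} f_x X/Z + c_x \\ f_y Y/Z + c_y \end{bmatrix}.

% Line re-projection residual for line k observed in frame i:
% distance of the observed endpoints to the projected line through
% \hat{p}_{i,k} = \pi_h(R_w P_{w,k} + t_w) and \hat{q}_{i,k} = \pi_h(R_w Q_{w,k} + t_w)
r_{l}(i,k) =
\begin{bmatrix}
d\!\left(p_{i,k}, \hat{p}_{i,k}, \hat{q}_{i,k}\right) \\
d\!\left(q_{i,k}, \hat{p}_{i,k}, \hat{q}_{i,k}\right)
\end{bmatrix},
\qquad
d(x, \hat{p}, \hat{q}) =
\frac{\left|(\hat{q}-\hat{p}) \times (x-\hat{p})\right|}{\lVert \hat{q}-\hat{p} \rVert}.
```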
In conventional SLAM, a BA-based optimization method is mostly adopted, and the IMU integration process needs to update the frame velocity and rotation every time the state is adjusted and to re-integrate after each iteration, which makes this propagation strategy very time-consuming.
An IMU typically comprises a three-axis accelerometer and a three-axis gyroscope measuring acceleration and angular velocity relative to the inertial frame. Under the influence of noise, the IMU measurements contain an additive white noise n and a slowly time-varying bias b: the measured acceleration is composed of the ideal (noise-free) accelerometer value a_t, the acceleration bias, the gravity term obtained by applying the rotation from the world frame to the body frame to the gravitational acceleration g_w in the world frame, and the accelerometer noise n_a; the measured angular velocity is composed of the ideal (bias-free) gyroscope value ω_t, the gyroscope bias, and the gyroscope noise n_w.
Further, the accelerometer noise n_a and the gyroscope noise n_w are modeled as Gaussian white noise.
Further, the derivatives of the acceleration bias and of the gyroscope bias are modeled as Gaussian white noise, i.e. the biases follow a random walk.
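For reference, the standard continuous-time IMU measurement and bias models consistent with the quantities named above (as used in VINS-type systems) can be written as follows; this is a reconstruction of the standard form, not the patent's own formula images.

```latex
\hat{a}_t = a_t + b_{a_t} + R^{t}_{w}\, g^{w} + n_a, \qquad
\hat{\omega}_t = \omega_t + b_{w_t} + n_w,

n_a \sim \mathcal{N}\!\left(0, \sigma_a^2\right), \qquad
n_w \sim \mathcal{N}\!\left(0, \sigma_w^2\right), \qquad
\dot{b}_{a_t} = n_{b_a} \sim \mathcal{N}\!\left(0, \sigma_{b_a}^2\right), \qquad
\dot{b}_{w_t} = n_{b_w} \sim \mathcal{N}\!\left(0, \sigma_{b_w}^2\right).
```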
in the invention, the IMU pre-integration value is used as a constraint between two frames of images and is used together with the camera re-projection residual error in the process of local window optimization.
Specifically, let b_k and b_{k+1} be two consecutive image frames, with b_k corresponding to time t_k and b_{k+1} corresponding to time t_{k+1}. The IMU pre-integration value over [t_k, t_{k+1}] constrains the relative motion between the two frames through three pre-integrated terms: a translation term from frame b_k to frame b_{k+1}, a velocity term, and a rotation term, where R^{b_k}_τ denotes the rotation from the intermediate time τ to frame b_k and τ is the intermediate time variable of the double integration.
Further, Ω(·) is built from the antisymmetric (skew-symmetric) matrix ⌊ω⌋_× of the angular velocity ω.
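A commonly used continuous-time form of these pre-integration terms, reconstructed here to be consistent with the symbols above (the standard VINS-style definitions rather than the patent's formula images), is:

```latex
\alpha^{b_k}_{b_{k+1}} = \iint_{t \in [t_k, t_{k+1}]}
    R^{b_k}_{\tau}\,\bigl(\hat{a}_{\tau} - b_{a_{\tau}}\bigr)\, d\tau^2, \qquad
\beta^{b_k}_{b_{k+1}} = \int_{t_k}^{t_{k+1}}
    R^{b_k}_{\tau}\,\bigl(\hat{a}_{\tau} - b_{a_{\tau}}\bigr)\, d\tau,

\gamma^{b_k}_{b_{k+1}} = \int_{t_k}^{t_{k+1}}
    \tfrac{1}{2}\,\Omega\!\bigl(\hat{\omega}_{\tau} - b_{w_{\tau}}\bigr)\,
    \gamma^{b_k}_{\tau}\, d\tau, \qquad
\Omega(\omega) =
\begin{bmatrix}
  -\lfloor \omega \rfloor_{\times} & \omega \\
  -\omega^{T} & 0
\end{bmatrix}.
```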
In the invention, the influence of bias changes is corrected by computing the Jacobian of each pre-integration term when the pre-integration is performed between two consecutive camera frames. Further, if the estimated bias change is large, the pre-integration is re-propagated with the new bias values. Through this process, a large amount of computation is saved, the computational efficiency is improved, and the speed of localization and map construction is increased.
Visual-inertial alignment adopts a loosely coupled vision-IMU scheme: first, structure from motion (SFM) is used to estimate the camera poses and the spatial positions of the three-dimensional points; these are then aligned with the IMU pre-integration data to obtain the gravity direction, the scale factor, the gyroscope bias and the velocity corresponding to each frame.
According to the present invention, in the back-end optimization process, unlike the conventional SLAM method, all state variables in the sliding window are optimized by minimizing the sum of all measurement residuals,
where the IMU measurement residual between frame i and frame i+1 is expressed in the body coordinate system, χ_b is the set of all IMU pre-integrations in the sliding window, the point re-projection error and the line re-projection error are summed over χ_p, the set of point features observed in frame i, and χ_l, the set of line features observed in frame i, e_prior is the prior information, and ρ(·) is the Cauchy robust function, which suppresses outliers.
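A representative form of the resulting sliding-window cost, assembled from the residual terms just listed, is shown below as a sketch of a standard tightly coupled point-line-IMU formulation; the covariance weights P and the exact residual definitions are assumptions rather than the patent's own formula.

```latex
\min_{\chi}\;\Bigl\{
  \lVert e_{\mathrm{prior}} \rVert^{2}
  + \sum_{k \in \chi_{b}}
        \bigl\lVert r_{B}\bigl(z^{b_k}_{b_{k+1}}, \chi\bigr) \bigr\rVert^{2}_{P_{b}}
  + \sum_{(i,j) \in \chi_{p}}
        \rho\Bigl( \bigl\lVert r_{p}(i,j) \bigr\rVert^{2}_{P_{p}} \Bigr)
  + \sum_{(i,k) \in \chi_{l}}
        \rho\Bigl( \bigl\lVert r_{l}(i,k) \bigr\rVert^{2}_{P_{l}} \Bigr)
\Bigr\}
```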
In the three-dimensional mapping process, as in the traditional SLAM method, the local point clouds are stitched into the final point cloud map by means of the estimated camera poses.
Accurate pose estimation plays a vital role in the accuracy of the point cloud map; if the pose estimation is not accurate enough, the final point cloud may overlap or warp. In the invention, the feature tracking node tracks features on the RGB image and acquires the distance information from the depth image; the depth image thus provides another source of scale information, which improves the robustness and accuracy of the system and makes the reconstructed three-dimensional point cloud more reliable.
Further, in the invention, three-dimensional points are projected from each depth image and synchronized with the key frames, so that the corresponding poses are estimated and the point cloud map is constructed; the system is compact and therefore well suited to integration on small robots. Moreover, this kind of point cloud map can be used directly for semantic segmentation or recognition, and further for obstacle avoidance and path planning.
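As an illustration of this mapping step, the sketch below back-projects a depth image with the pinhole model and stitches the per-keyframe clouds using the estimated camera-to-world poses; the depth scale, the data layout and the function names are assumptions.

```python
import numpy as np

def depth_to_cloud(depth, fx, fy, cx, cy, depth_scale=0.001):
    """Back-project a depth image (uint16, millimetres assumed) to 3D
    points in the camera frame using the pinhole model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float32) * depth_scale
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]          # keep valid depths only

def stitch_map(keyframes, fx, fy, cx, cy):
    """Transform each keyframe's local cloud into the world frame with the
    estimated pose (R, t) and concatenate into the global map."""
    clouds = []
    for depth, (R, t) in keyframes:    # R: 3x3, t: 3-vector, camera-to-world
        local = depth_to_cloud(depth, fx, fy, cx, cy)
        clouds.append(local @ R.T + t)
    return np.concatenate(clouds, axis=0)
```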
In another aspect, the invention also discloses an apparatus comprising a front-end visual odometry module, a back-end optimization module and a three-dimensional mapping module.
The front-end visual-inertial odometry module comprises a feature detection and matching sub-module, an IMU pre-integration sub-module and a visual-inertial alignment sub-module.
The feature detection and matching submodule is used for extracting and tracking point features and extracting and tracking line features.
Further, in the feature detection and matching sub-module, the line features are extracted and tracked through steps S11 to S13.
The IMU pre-integration sub-module uses the IMU pre-integration value as a constraint between two frames and uses it together with the camera re-projection residual in the local window optimization, the constraint being formed by the pre-integration terms described above.
The visual-inertial alignment sub-module estimates the camera poses and the spatial positions of the three-dimensional points using structure from motion (SFM), and then aligns them with the IMU pre-integration data to obtain the gravity direction, the scale factor, the gyroscope bias and the velocity corresponding to each frame.
The back-end optimization module optimizes all state variables in the sliding window by minimizing the sum of all measurement residuals, as described above.
The three-dimensional mapping module stitches the local point clouds into the final point cloud map by means of the estimated camera poses.
The various embodiments of the methods and apparatus described above may be implemented in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present invention may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the methods and apparatus described herein may be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The methods and apparatus described herein may be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that overcomes the drawbacks of high management difficulty and weak service expansibility of traditional physical hosts and VPS (Virtual Private Server) services. The server may also be a server of a distributed system or a server combined with a blockchain.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, so long as the desired result of the technical solution of the present disclosure is achieved, and the present disclosure is not limited herein.
Examples
Example 1
Simulation experiments of visual-inertial simultaneous localization and mapping were performed using the public OpenLORIS-Scene dataset, which was collected in five scenes (office, corridor, home, cafe and market) that are very typical autonomous robot application scenes.
The visual-inertial simultaneous localization and mapping comprises a front-end visual odometry process, a back-end optimization process and a three-dimensional mapping process.
In the front-end visual odometry process, point features and line features are extracted and tracked. For the point features, Shi-Tomasi corners are extracted as feature points, the KLT optical flow method is used to track them, and points with large tracking discrepancy are eliminated based on the reverse optical flow method;
The line feature extraction tracking includes the sub-steps of:
S11, suppressing the length of the line segment; s12, extracting line characteristics through near line combination and short line splicing; s13, tracking line features through an optical flow method; s14, parameterizing the point and line characteristics.
In step S11, the line segment length suppression is performed as described above, with the scaling factor η set to 0.1.
Step S12 comprises the following sub-steps: s121, sequencing the wire segments; s122, screening the wire segments; s123, filtering the screened line segment groups to obtain a merging candidate line segment set and a splicing candidate line segment set; s124, merging or splicing the line segments.
In step S121, the line segments are sorted by length from long to short; in step S122, the screening is angle screening with the angle threshold θ_min set to π/90.
In step S123, the obtained candidate line segment group L_α is filtered to obtain the merging candidate line segment set and the splicing candidate line segment set as described above, with the proximity threshold d_min set to 9 pixels.
The step S13 includes the following sub-steps: s131, matching anchor points; s132, matching points with lines; s133, matching the lines.
Wherein the total number of anchor points in sub-step S131 is 12.
In step S14, the point and line features are parameterized, and error constraints are applied to the line features as described above.
In the front-end visual odometry process, the IMU pre-integration value is used as the constraint between two frames and is used together with the camera re-projection residual in the local window optimization.
In the front-end visual odometry process, visual-inertial alignment adopts the loosely coupled vision-IMU scheme: structure from motion (SFM) is first used to estimate the camera poses and the spatial positions of the three-dimensional points, which are then aligned with the IMU pre-integration data to obtain the gravity direction, the scale factor, the gyroscope bias and the velocity corresponding to each frame.
In the back-end optimization process, all state variables in the sliding window are optimized by minimizing the sum of all measurement residuals.
In the three-dimensional mapping process, the local point clouds are stitched into the final point cloud map by means of the estimated camera poses.
FIGS. 9-12 show the point and line feature detection in several low-texture and illumination-changing scenes of the OpenLORIS-Scene dataset, where FIG. 9 shows the cafe results, FIG. 10 the home 1 results, FIG. 11 the home 2 results and FIG. 12 the corridor results; FIGS. 9-a, 10-a, 11-a and 12-a show the line feature detection results, and FIGS. 9-b, 10-b, 11-b and 12-b show the point feature detection results.
It can be seen that indoor scenes are mostly man-made structured scenes with abundant straight-line structures such as edges and right angles. In the low-texture scenes and the scenes with strong illumination changes (cafe, corridor, home 1 and home 2), point features are scarce and point feature detection is insufficient, but the line feature detection algorithm can still obtain abundant line features. Compared with point detection alone, the point-line feature complementation can provide richer and more robust feature information for subsequent motion estimation.
FIGS. 9-c, 10-c, 11-c and 12-c show straight line detection and tracking in the different scenes: the line feature matching algorithm obtains more effective information and describes the geometric structure more intuitively, providing richer and more robust information for the subsequent visual odometry to estimate the camera motion from the combined features.
FIG. 13-a compares the results in the corridor scene with the ground-truth trajectory, with the color changing gradually from blue to red as the error increases; FIGS. 13-b and 13-c show the results in top view and side view.
As the motion speed increases, more motion blur is introduced and the viewing angle can change significantly; together with challenging illumination conditions, this degrades feature tracking and performance. As can be seen from the figures, this embodiment accurately follows the real trajectory, and drift-free tracking is achieved in particular when the camera rotates fast, which indicates that adding depth information to the visual-inertial system considerably improves the pose estimation.
Example 2
A localization and mapping experiment is performed in an indoor environment, and the localization accuracy is verified against the data of the OptiTrack motion capture system.
The experimental platform is an unmanned aerial vehicle equipped with an Intel RealSense D435i depth camera, an Intel NUC onboard computer and a Pixhawk V5+ flight controller. The D435i depth camera captures 640 x 800 resolution color images, and the camera has a built-in six-axis IMU sensor. In the experiment, the sampling frequencies of the camera and the IMU were set to 30 Hz and 200 Hz, respectively. The onboard computer processes the image and IMU data acquired by the depth-inertial camera in real time.
14 different scenes (Lab1-Lab14) were set up indoors; these scenes feature low texture, illumination changes, rapid acceleration and angular rotation.
The onboard computer performs localization and map construction on the images acquired by the depth camera as in Example 1.
Comparative example 1
The same experiment as in Example 1 was performed, except that the ORB-SLAM2 method was used (R. Mur-Artal and J. D. Tardós, "ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras," IEEE Transactions on Robotics 33(5) (2017) 1255-1262). ORB-SLAM2 is a point-based monocular SLAM system that can operate in large-scale, small-scale, indoor and outdoor environments through its added RGBD interface.
Comparative example 2
The same experiment as in Example 1 was performed, except that the PL-VINS method was used (Q. Fu, J. Wang, H. Yu, I. Ali, F. Guo, Y. He, H. Zhang, "PL-VINS: Real-time monocular visual-inertial SLAM with point and line features," 2020, arXiv:2009.07462). PL-VINS is a monocular visual-inertial optimization method with point and line features developed on the basis of the point-based VINS-Mono.
Comparative example 3
The same experiment as in Example 2 was performed, except that the VINS-RGBD method was used (Z. Shan, R. Li, S. Schwertfeger, "RGBD-inertial trajectory estimation and mapping for ground robots," Sensors 19(10) (2019) 2251).
Comparative example 4
The same experiment as in Example 2 was performed, except that the ORB-SLAM2 method was used (R. Mur-Artal and J. D. Tardós, "ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras," IEEE Transactions on Robotics 33(5) (2017) 1255-1262).
Comparative example 5
The same experiment as in Example 2 was performed, except that the PL-VINS method was used (Q. Fu, J. Wang, H. Yu, I. Ali, F. Guo, Y. He, H. Zhang, "PL-VINS: Real-time monocular visual-inertial SLAM with point and line features," 2020, arXiv:2009.07462).
Experimental example
Experimental example 1
Comparative example 1 and comparative examples 1 and 2 were evaluated using Root Mean Square Error (RMSE), maximum error (max error) and mean error (mean error), and the results are shown in table one.
List one
As can be seen from the first table, the RMSE in the embodiment 1 is obviously lower than that in the scenes of the comparative examples 1, market1-1 and corridor-4, the comparative example 2 has tracking faults and serious drift, a final track cannot be obtained, and simulation results show that the embodiment 1 has better robustness.
In addition, in the case of rapid motion, i.e., in the sparse feature sequences Office 1-7, example 1 still maintains a high accuracy of positioning of about 0.18m, which is about 44% higher than comparative example 1.
Furthermore, the localization error of Example 1 fluctuates little and the localization estimates are consistent, achieving robust and accurate localization in low-texture environments.
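For readers who wish to reproduce this evaluation, the following minimal sketch shows one common way to compute the RMSE, maximum error, and mean error of an estimated trajectory against time-aligned ground truth (from a dataset or a motion capture system). The function and variable names are illustrative only and are not part of the patented method; the sketch assumes the two position sequences have already been associated by timestamp and expressed in a common frame.

```python
import numpy as np

def trajectory_errors(est_xyz, gt_xyz):
    """Absolute trajectory error statistics for time-aligned position sequences.

    est_xyz, gt_xyz: (N, 3) arrays of estimated and ground-truth positions,
    already associated by timestamp and expressed in the same frame.
    Returns (rmse, max_error, mean_error) in the units of the input (metres here).
    """
    diff = np.asarray(est_xyz) - np.asarray(gt_xyz)   # per-pose position error vectors
    norms = np.linalg.norm(diff, axis=1)              # Euclidean error of each pose
    rmse = float(np.sqrt(np.mean(norms ** 2)))
    return rmse, float(norms.max()), float(norms.mean())

if __name__ == "__main__":
    # Illustrative usage with synthetic data standing in for real trajectories.
    gt = np.cumsum(np.random.randn(500, 3) * 0.01, axis=0)
    est = gt + np.random.randn(500, 3) * 0.02
    print(trajectory_errors(est, gt))
```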
Experimental example 2
The RMSE results (in meters) of Example 2 and Comparative Examples 3 to 5 are shown in Table II.
Table II
As can be seen from Table II, Example 2 maintains its accuracy advantage under low indoor texture and illumination changes; it shows the smallest overall fluctuation and the highest stability, with the root mean square error kept within 0.44 meters.
The results of Example 2 and Comparative Examples 3-5 in scene Lab14 are shown in FIG. 14; it can be seen from the figure that the trajectory of Example 2 is closer to the OptiTrack ground truth, and its localization error is significantly smaller than that of Comparative Examples 3-5.
FIG. 15 shows a box plot of the translational error on scene Lab1, and FIG. 16 shows a box plot of the rotational error on scene Lab1. As can be seen from these figures, the spread of the error distribution of Example 2 is smaller than that of Comparative Examples 3 to 5.
FIG. 17 shows the position drift of Example 2 in scene Lab3, where the three-axis error is concentrated within 300 mm, and FIG. 18 shows the position drift of Example 2 in scene Lab6, where the three-axis error is concentrated within 200 mm, illustrating that the estimate of Example 2 varies smoothly and that its localization accuracy and stability are good.
FIGS. 19 to 21 show the three-dimensional maps obtained by Comparative Example 3 and Example 2, with the left column showing the results of Comparative Example 3 and the right column the results of Example 2. As can be seen from the figures, Comparative Example 3 has large errors in the horizontal and vertical directions, and ghosting and warping appear during map stitching, lowering the map construction accuracy. In Example 2, the map stitching is greatly improved and the ghosting and warping problems are resolved, which also shows that the accumulated error in the map construction process is reduced; Example 2 completes indoor map construction well.
The invention has been described above in connection with preferred embodiments, which are, however, exemplary only and for illustrative purposes. On this basis, the invention can be subjected to various substitutions and improvements, and all fall within the protection scope of the invention.
Claims (5)
1. An RGBD visual-inertial simultaneous positioning and map construction method based on point-line feature fusion, comprising a front-end visual odometry process, a back-end optimization process and a three-dimensional map construction process, characterized in that,
in the front-end visual odometry process, the input information of the visual-inertial simultaneous localization and map construction system is images acquired by an RGBD camera and measurement values of an IMU;
the front-end visual odometry is based on point-line features and comprises feature detection and matching, IMU pre-integration and visual-inertial alignment,
The feature detection and matching comprises the extraction and tracking of point features and the extraction and tracking of line features;
the extraction and tracking of the line features comprises the following sub-steps:
S11, performing line segment length suppression;
S12, extracting line features through near-line merging and short-line splicing;
S13, tracking line features by an optical flow method;
In S11, the line segment length suppression is expressed as:
len_i = √((x_sp − x_ep)² + (y_sp − y_ep)²) ≥ len_min = ⌈η · min(W_I, H_I)⌉
wherein len_i represents the length of the i-th line segment in the image, len_min represents the threshold of the shortest line feature length, W_I is the width of the image, H_I is the height of the image, the symbol ⌈·⌉ represents rounding up, η is a scaling factor, (x_sp, y_sp) represents the start coordinates of the line segment, and (x_ep, y_ep) represents the end coordinates of the line segment;
in step S12, the near-line merging and short-line splicing comprise the following sub-steps:
S121, sorting the line segments; S122, screening the line segments; S123, filtering the screened line segment groups to obtain a merging candidate line segment set and a splicing candidate line segment set; S124, merging or splicing the line segments;
In step S121, the line segments are sorted by length to obtain a line segment sequence {L_1, L_2, ..., L_n}, where L_1 represents the longest line segment;
In step S122, the screening includes angle screening, which may be expressed as:
L_α = { L_m ∈ L : |θ_1 − θ_m| < θ_min }
wherein L_α represents the candidate line segment group obtained by angle screening, L_m represents a line segment in the remaining line feature set, θ_m is the angle of line segment L_m with respect to the horizontal direction, θ_1 is the corresponding angle of L_1, θ_min is an angle threshold measuring the similarity of line segments, and L is the set of line segments other than L_1;
In S123, the filtering of the near-line merging process and the filtering of the short-line splicing process compare endpoint distances against d_min, a threshold for measuring line segment proximity, and respectively yield the merging candidate line segment set L_β1 and the splicing candidate line segment set L_β2; among the candidates, L_j and L_k denote segments considered for near-line merging, each defined by its two endpoints, and L_u and L_v denote two line segments considered for short-line splicing, each defined by its two endpoints;
combining the line segments whose angles and spatial distances are close to those of L_1 into a group to obtain the candidate line segment set L_β = {L_β1, L_β2};
In S124, the line segment L_1 is added to the screened candidate line segment set L_β to form a new line segment set {L_1, L_β}; the two endpoints that are farthest apart within this set are selected as the start point and the end point of a new line segment, synthesizing a new line segment L_M;
repeating the above process, and merging and splicing all line segments in the image;
step S13 comprises the following sub-steps:
S131, anchor point matching;
S132, point-to-line matching;
S133, line-to-line matching;
In step S131, the anchor points are spaced points along a line segment, so that one line segment is represented by a plurality of anchor points; two consecutive frame images are denoted I_0 and I_1; for each line segment ⁰l_i = (⁰p_i, ⁰q_i) in the I_0 frame image, the start point and the end point of the segment and a plurality of points along the direction from the start point to the end point are selected as anchor points, and in the I_1 frame image a matching point ¹λ_{i,j} is obtained along the normal direction ⁰n_i of each anchor point ⁰λ_{i,j} of the I_0 frame image, where the normal direction is:
⁰n_i = [−⁰a_{i,1}, ⁰a_{i,0}]^T
and the unit direction vector from the start point to the end point is:
⁰a_i = [⁰a_{i,0}, ⁰a_{i,1}]^T = (⁰q_i − ⁰p_i) / ‖⁰q_i − ⁰p_i‖
In step S132, in the I_1 frame, the distance from each matching point ¹λ_{i,j} to each line segment ¹l_k is computed, giving a distance set d_k; the shortest distance in this set is denoted d_min, and the line segment corresponding to it is denoted ¹l_{closest(i,j)}; if d_min < d_threshold, the matching point ¹λ_{i,j} is considered to belong to the line segment ¹l_{closest(i,j)};
otherwise, the matching point ¹λ_{i,j} is not considered to belong to the line segment ¹l_{closest(i,j)}, and the next anchor point is examined;
wherein d_threshold is an empirical threshold;
In step S133, for the line segment ⁰l_i in the I_0 frame, the number of its anchor points matched in the I_1 frame is counted; let Z_i be the number of anchor points matched to the line segment ¹l_{match(i)}; if Z_i / N_i > R, then ⁰l_i and ¹l_{match(i)} (0 ≤ match(i) < M_1) match each other, where R is an empirical threshold, N_i is the total number of anchor points of ⁰l_i, and M_1 represents the total number of line segments in the I_1 frame image;
The front-end visual odometry process further comprises the step of:
S14, parameterizing the point and line features;
In the parameterization of the point and line features, error constraints are imposed on the line features, and the error constraints of the point features and the line features are as follows:
wherein the first term is the point projection error term and the second term is the line projection error term; T_Iw represents the state of the RGBD camera for the I-th frame image in the world coordinate system, and X_{w,j} represents the j-th point observed in the I-th frame; P_{w,k} and Q_{w,k} denote the two endpoints of the k-th line segment observed in the I-th frame, p_{i,k} and q_{i,k} are the points on the image plane corresponding to P_{w,k} and Q_{w,k}, whose projections onto the image plane are the projected endpoints; the line projection error is composed of the distance from p_{i,k} to the line segment projected from the world coordinate system and the distance from q_{i,k} to that projected line segment; π(·) is the camera projection model, and π_h(·) is the camera projection model in homogeneous coordinates.
2. The RGBD visual inertia simultaneous localization and mapping method based on point-line feature fusion of claim 1, wherein,
The IMU pre-integration value is taken as a constraint between two consecutive camera frame images.
3. The RGBD visual inertia simultaneous localization and mapping method based on point-line feature fusion of claim 1, wherein,
In the back-end optimization process, all state variables in the sliding window are optimized by minimizing the sum of all measurement residuals.
4. An electronic device, comprising:
At least one processor; and
A memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-3.
5. A computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-3.
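The following illustrative sketches are not part of the claims; they are minimal Python approximations of the claimed steps, with illustrative function names, data layouts, and threshold values (η, θ_min, d_min, R) chosen only for demonstration. This first sketch approximates the line segment length suppression of step S11 and a simplified version of the sorting, screening, filtering, and merging of step S12, assuming each segment is stored as a pair of endpoint coordinates.

```python
import math

def seg_length(seg):
    (xs, ys), (xe, ye) = seg
    return math.hypot(xe - xs, ye - ys)

def seg_angle(seg):
    (xs, ys), (xe, ye) = seg
    return math.atan2(ye - ys, xe - xs)

def suppress_short_lines(segments, img_w, img_h, eta=0.1):
    """S11: keep only segments whose length reaches len_min = ceil(eta * min(W_I, H_I))."""
    len_min = math.ceil(eta * min(img_w, img_h))
    return [s for s in segments if seg_length(s) >= len_min]

def merge_with_longest(segments, theta_min=math.radians(5.0), d_min=5.0):
    """S12 (simplified): take the longest segment L1, gather segments with a similar
    direction and nearby endpoints, and synthesize one segment from the two mutually
    farthest endpoints of the group (S121-S124)."""
    if not segments:
        return None, []
    ordered = sorted(segments, key=seg_length, reverse=True)      # S121: sort by length
    L1, rest = ordered[0], ordered[1:]
    theta1 = seg_angle(L1)

    def angle_close(s):                                           # S122: angle screening
        d = seg_angle(s) - theta1
        d = math.atan2(math.sin(d), math.cos(d))                  # wrap to (-pi, pi]
        return abs(d) < theta_min

    def endpoint_gap(a, b):                                       # S123: proximity check
        return min(math.dist(p, q) for p in a for q in b)

    group = [s for s in rest if angle_close(s) and endpoint_gap(L1, s) < d_min]

    # S124: the two mutually farthest endpoints of {L1} + group define the new segment.
    pts = [p for s in [L1] + group for p in s]
    start, end = max(((p, q) for p in pts for q in pts), key=lambda pq: math.dist(*pq))
    remaining = [s for s in rest if s not in group]
    return (start, end), remaining
```

In a full implementation this routine would be applied repeatedly until no further segment can be merged or spliced, mirroring the "repeating the above process" clause of claim 1.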
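This second sketch approximates the anchor-point based line tracking of step S13 (S131-S133), assuming the anchor points sampled in frame I_0 have already been tracked into frame I_1 by an external optical flow routine (for example, pyramidal Lucas-Kanade); the voting threshold R and distance threshold d_threshold are illustrative.

```python
import numpy as np

def sample_anchors(p, q, step=8.0):
    """S131: sample anchor points along the segment p->q at a fixed pixel spacing."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    n = max(int(np.linalg.norm(q - p) // step), 1) + 1
    t = np.linspace(0.0, 1.0, n)[:, None]
    return p + t * (q - p)                           # includes the start and end points

def point_to_segment_distance(x, p, q):
    """Shortest distance from point x to the segment with endpoints p and q."""
    x, p, q = (np.asarray(v, dtype=float) for v in (x, p, q))
    d = q - p
    t = np.clip(np.dot(x - p, d) / np.dot(d, d), 0.0, 1.0)
    return float(np.linalg.norm(x - (p + t * d)))

def match_lines(tracked_anchors, curr_lines, d_threshold=3.0, R=0.6):
    """S132-S133: each tracked anchor votes for the nearest current segment within
    d_threshold; a previous line is matched to the current line that collects more
    than a fraction R of its anchors."""
    matches = {}
    if not curr_lines:
        return matches
    for i, anchors in enumerate(tracked_anchors):    # anchors of previous line i, tracked into I_1
        votes = {}
        for a in anchors:
            dists = [point_to_segment_distance(a, p, q) for (p, q) in curr_lines]
            k = int(np.argmin(dists))
            if dists[k] < d_threshold:
                votes[k] = votes.get(k, 0) + 1       # this anchor belongs to segment k
        if votes:
            k_best, z = max(votes.items(), key=lambda kv: kv[1])
            if z / len(anchors) > R:
                matches[i] = k_best                  # line i in I_0 matches line k_best in I_1
    return matches
```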
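This third sketch illustrates the kind of point and line reprojection residuals referred to in step S14, assuming a pinhole camera with intrinsic matrix K and a camera-from-world transform T_cw; the residual definitions shown here (pixel difference for points, endpoint-to-projected-line distances for lines) are a common formulation and are given for illustration only.

```python
import numpy as np

def project(K, T_cw, X_w):
    """Pinhole projection of a 3-D world point into pixel coordinates."""
    X_c = T_cw[:3, :3] @ X_w + T_cw[:3, 3]      # transform the point into the camera frame
    u = K @ (X_c / X_c[2])                      # normalize by depth, apply intrinsics
    return u[:2]

def point_residual(K, T_cw, X_w, x_obs):
    """Point reprojection error: observed pixel minus projected pixel."""
    return np.asarray(x_obs, dtype=float) - project(K, T_cw, X_w)

def line_residual(K, T_cw, P_w, Q_w, p_obs, q_obs):
    """Line reprojection error: distances of the observed endpoints to the infinite
    line through the projected endpoints of the 3-D segment."""
    p_proj = project(K, T_cw, P_w)
    q_proj = project(K, T_cw, Q_w)
    l = np.cross(np.append(p_proj, 1.0), np.append(q_proj, 1.0))   # homogeneous line
    l /= np.linalg.norm(l[:2])                                     # so that l . x is a distance
    return np.array([l @ np.append(p_obs, 1.0), l @ np.append(q_obs, 1.0)])
```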
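Finally, for the IMU pre-integration constraint mentioned in claim 2, the sketch below shows a simple Euler-integration approximation of the pre-integrated rotation, velocity, and position increments between two camera frames; a production system would additionally propagate covariances and bias Jacobians, which are omitted here.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def so3_exp(phi):
    """Rodrigues formula: axis-angle vector to rotation matrix."""
    theta = np.linalg.norm(phi)
    if theta < 1e-9:
        return np.eye(3) + skew(phi)
    k = phi / theta
    K = skew(k)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def preintegrate(acc, gyr, dt, bias_a=None, bias_g=None):
    """Accumulate relative rotation/velocity/position increments between two frames.

    acc, gyr: (N, 3) accelerometer and gyroscope samples; dt: sample period in seconds.
    The increments do not depend on the absolute start pose, so they can be reused as
    a constraint between two consecutive camera frames when states are re-linearized.
    """
    bias_a = np.zeros(3) if bias_a is None else bias_a
    bias_g = np.zeros(3) if bias_g is None else bias_g
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for a, w in zip(acc, gyr):
        a, w = a - bias_a, w - bias_g
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt ** 2
        dv = dv + (dR @ a) * dt
        dR = dR @ so3_exp(w * dt)
    return dR, dv, dp
```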
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110914560.8A CN113763470B (en) | 2021-08-10 | 2021-08-10 | RGBD visual inertia simultaneous positioning and map construction with point-line feature fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113763470A CN113763470A (en) | 2021-12-07 |
CN113763470B true CN113763470B (en) | 2024-06-07 |
Family
ID=78789041
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110914560.8A Active CN113763470B (en) | 2021-08-10 | 2021-08-10 | RGBD visual inertia simultaneous positioning and map construction with point-line feature fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113763470B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024112266A1 (en) * | 2022-11-25 | 2024-05-30 | Nanyang Technological University | Visual odometry methods and systems |
CN117197211B (en) * | 2023-09-04 | 2024-04-26 | 北京斯年智驾科技有限公司 | Depth image generation method, system, device and medium |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019157925A1 (en) * | 2018-02-13 | 2019-08-22 | 视辰信息科技(上海)有限公司 | Visual-inertial odometry implementation method and system |
CN112381890A (en) * | 2020-11-27 | 2021-02-19 | 上海工程技术大学 | RGB-D vision SLAM method based on dotted line characteristics |
CN112432653A (en) * | 2020-11-27 | 2021-03-02 | 北京工业大学 | Monocular vision inertial odometer method based on point-line characteristics |
CN112683305A (en) * | 2020-12-02 | 2021-04-20 | 中国人民解放军国防科技大学 | Visual-inertial odometer state estimation method based on point-line characteristics |
CN112802196A (en) * | 2021-02-01 | 2021-05-14 | 北京理工大学 | Binocular inertia simultaneous positioning and map construction method based on dotted line feature fusion |
Non-Patent Citations (2)
Title |
---|
Design and localization implementation of a visual odometry architecture fusing point and line features; Zhao Jiaheng et al.; Transactions of Beijing Institute of Technology; 2019-05-31; full text *
Research on a robot SLAM algorithm fusing binocular point-line features and inertial navigation; Feng Bo et al.; Journal of Chinese Computer Systems; 2021-06-30; full text *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Yang et al. | Monocular object and plane slam in structured environments | |
CN109166149B (en) | Positioning and three-dimensional line frame structure reconstruction method and system integrating binocular camera and IMU | |
CN112634451B (en) | Outdoor large-scene three-dimensional mapping method integrating multiple sensors | |
Yang et al. | Pop-up slam: Semantic monocular plane slam for low-texture environments | |
WO2021119024A1 (en) | Interior photographic documentation of architectural and industrial environments using 360 panoramic videos | |
CN105783913A (en) | SLAM device integrating multiple vehicle-mounted sensors and control method of device | |
CN111311666A (en) | Monocular vision odometer method integrating edge features and deep learning | |
Kim et al. | Direct semi-dense SLAM for rolling shutter cameras | |
Matsuki et al. | Codemapping: Real-time dense mapping for sparse slam using compact scene representations | |
CN112802196B (en) | Binocular inertia simultaneous positioning and map construction method based on dotted line feature fusion | |
CN113763470B (en) | RGBD visual inertia simultaneous positioning and map construction with point-line feature fusion | |
CN205426175U (en) | Fuse on -vehicle multisensor's SLAM device | |
Xia et al. | Visual-inertial simultaneous localization and mapping: Dynamically fused point-line feature extraction and engineered robotic applications | |
Koch et al. | Wide-area egomotion estimation from known 3d structure | |
CN112945233B (en) | Global drift-free autonomous robot simultaneous positioning and map construction method | |
Li et al. | Indoor layout estimation by 2d lidar and camera fusion | |
Zuo et al. | Cross-modal semi-dense 6-dof tracking of an event camera in challenging conditions | |
Zhao et al. | A review of visual SLAM for dynamic objects | |
Zhang et al. | 3D reconstruction of weak feature indoor scenes based on hector SLAM and floorplan generation | |
CN115930989A (en) | Visual odometer, vehicle and positioning method | |
Laskar et al. | Robust loop closures for scene reconstruction by combining odometry and visual correspondences | |
Zhang et al. | PLI-VIO: Real-time monocular visual-inertial odometry using point and line interrelated features | |
Biström | Comparative analysis of properties of LiDAR-based point clouds versus camera-based point clouds for 3D reconstruction using SLAM algorithms | |
Zhenghang et al. | The Semantic Point & Line SLAM for Indoor Dynamic Environment | |
CN113850293A (en) | Positioning method based on multi-source data and direction prior joint optimization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant |