CN112115980A - Binocular vision odometer design method based on optical flow tracking and point line feature matching
- Publication number
- CN112115980A (Application number CN202010862610.8A)
- Authority
- CN
- China
- Prior art keywords
- point
- frame
- line
- matching
- feature
- Prior art date
- Legal status
- Pending
Classifications
- G06V10/751—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
- G01C22/00—Measuring distance traversed on the ground by vehicles, persons, animals or other moving solid bodies, e.g. using odometers, using pedometers
- G06V10/50—Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
Abstract
The invention relates to a binocular vision odometer design method based on optical flow tracking and point-line feature matching, and belongs to the technical field of robot positioning and navigation. The method comprises the following steps in sequence: receiving the visual images input by a binocular camera; tracking point features with the KLT optical flow method and performing point feature data association; extracting line features, computing line feature descriptors, and performing break-and-merge processing on the extracted line features; performing data association of the point and line features between the previous and current frames by optical flow tracking matching and descriptor matching respectively, solving the reprojection errors, and estimating the pose of the current frame; judging whether the current frame is a key frame, and if so, extracting point features and computing their descriptors; if not, setting a reference key frame for the current frame; performing data association of the point and line features among the local key frames by descriptor matching, solving the reprojection errors, and optimizing the poses again; and after the local key frame poses are optimized, adjusting the poses of the non-key frames.
Description
Technical Field
The invention belongs to the technical field of robot positioning and navigation, and particularly relates to a binocular vision odometer design method based on optical flow tracking and point line feature matching.
Background
The visual odometer (VO) is an important component of a navigation system and is widely applied in robotics, for example in unmanned vehicles and unmanned aerial vehicles. In these applications, one alternative to VO is to use an inertial measurement unit (IMU), but this solution has the disadvantage that substantial errors accumulate over time because the effect of gravity cannot be cancelled accurately. Conventional alternatives also include wheeled odometers and GPS-based navigation systems, neither of which can replace VO: a wheeled odometer has large measurement errors and poor accuracy over long periods of use, while GPS-based navigation systems are limited to open, unobstructed outdoor environments and cannot estimate the orientation of the devices to which they are connected. VO technology can largely make up for the shortcomings of the above navigation schemes, and a further advantage of VO is that the required information (provided by the camera) can also be used for SLAM, scene recognition and other navigation-related tasks.
Among existing VO algorithms, methods based on optical flow tracking do not need to compute or match descriptors, which saves a large amount of computation; they are fast and well suited to real-time requirements. However, optical flow tracking is easily affected by illumination, performs poorly in weakly textured scenes, and fails to track reliably when the camera motion is large. Methods based on point features are more robust than optical flow tracking when the camera motion is large, and their accuracy is higher because feature points are matched by descriptors; their drawback is that point feature extraction and matching are computationally expensive and time-consuming, and they cannot work normally in weakly textured scenes. Methods based on line features are more robust than point feature methods in weakly textured scenes, but line feature extraction and matching are even more computationally expensive than for point features.
Disclosure of Invention
Technical problem to be solved
In order to solve the problems in the existing VO technology that positioning accuracy is low when only an optical flow method is used and real-time performance is poor when only a point-line feature method is used, the invention provides a binocular vision odometer design method based on optical flow tracking and point-line feature matching.
Technical scheme
A binocular vision odometer design method based on optical flow tracking and point line feature matching is characterized by comprising the following steps:
step 1: acquiring an image by using a binocular camera, converting the image into a gray image, and then performing enhancement processing on the image by using a self-adaptive histogram equalization algorithm;
step 2: using a KLT optical flow method to track point features of a previous frame image, and establishing data association of the point features between the previous frame and the next frame by combining a bidirectional annular matching strategy:
step 2.1: tracking a left eye image of a previous frame by using a KLT optical flow method and a left eye image of a current frame, and performing tracking matching of point characteristics;
step 2.2: screening and supplementing point characteristics;
step 2.3: performing point feature tracking matching between the left eye image and the right eye image of the current frame by using a KLT optical flow method, and further calculating the three-dimensional coordinates corresponding to the matched points according to a binocular stereo vision algorithm;
step 3: extracting line features of the current frame:
step 3.1: extracting line features based on an LSD line feature extraction algorithm, and calculating a line feature descriptor based on an LBD line feature description algorithm;
step 3.2: performing disconnection merging operation on the extracted line characteristics;
step 3.3: performing line feature matching between the left eye image and the right eye image of the current frame;
step 4: matching line features between the previous frame and the current frame;
step 5: based on the data association result of the point features between the previous frame and the current frame, a PnP pose estimation method is adopted to obtain the initial pose estimate of the current frame;
step 6: optimizing and adjusting the pose of the current frame by using a cost function constructed by the reprojection errors of the point characteristics and the line characteristics between the previous frame and the next frame;
step 7: judging whether the current frame is a key frame, if so, executing step 8; if not, setting the previous key frame as the reference key frame of the current frame, and then jumping to step 12;
step 8: extracting the point characteristics of the current frame:
step 8.1: extracting and describing point features of the current frame by adopting an ORB algorithm;
step 8.2: performing sparse stereo matching on the point characteristics between the left and right eye images of the current frame in a descriptor matching mode;
step 9: carrying out data association of point and line characteristics among local key frames;
step 10: performing secondary optimization adjustment on the pose of the local key frames by using a cost function constructed by the re-projection errors of the point features and the line features between the local key frames;
step 11: adjusting the pose of the non-key frame;
step 12: and outputting the poses of all the image frames.
The bidirectional ring matching strategy in step 2 refers to the following: the feature point set of the previous-frame left eye image is tracked and matched into the current-frame left eye image by the bidirectional KLT optical flow method, giving the feature point set of the current-frame left eye image; the feature point set of the current-frame left eye image is then tracked and matched into the current-frame right eye image by the bidirectional KLT optical flow method, giving a temporary feature point set X1; the temporary feature point set X1 of the current-frame right eye image is tracked and matched into the previous-frame right eye image by the bidirectional KLT optical flow method, giving a temporary feature point set X2; it is then judged whether each feature point in X2 falls within the set neighborhood of the original feature points of the previous-frame right eye image; the feature points falling outside this neighborhood are deleted from X2, and the corresponding feature points in X1 are deleted according to the matching relationship between X1 and X2, thereby obtaining the feature point set of the current-frame right eye image.
In step 4, two successfully matched line segments l_1, l_2 satisfy the following conditions:
(1) the line segments detected by LSD are directed, and the angle between the direction vectors of the two matched line segments is less than Φ;
(4) the distance between the corresponding LBD feature vectors is less than ρ_T and is the smallest among all candidate matches.
In step 7, a key frame is determined according to the following principle: all of the following conditions are satisfied:
(1) at least 20 image frames lie between the current frame and the previous key frame;
(2) at least 50 point features and 15 line features are successfully tracked;
(3) fewer than 75% of the point and line features are co-visible with the previous key frame.
Advantageous effects
The invention provides a binocular vision odometer design method based on optical flow tracking and point-line feature matching. First, line features are introduced into a point-feature-based visual odometry method, that is, a feature matching scheme combining point and line features is used, which improves the robustness and positioning accuracy of the algorithm and keeps good performance even in weakly textured scenes. At the same time, an optical flow method is introduced into the point-line-feature-based visual odometry method to track and match the point features, which improves the speed of the algorithm. The method therefore combines the advantages of the optical flow tracking method, the point feature method and the line feature method, and overcomes the shortcomings of using any one of them alone.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention
FIG. 2 illustrates a two-way circular matching strategy
FIG. 3 is a line feature merge strategy
FIG. 4 is a schematic of a reprojection error of line features
FIG. 5 shows the experimental results on the MH_04_difficult image sequence
FIG. 6 shows a comparison of APE results
Detailed Description
The invention will now be further described with reference to the following examples and drawings:
In the experiments, the algorithm was tested using the MH_04_difficult image sequence of the internationally recognized EuRoC data set, with the computer configured as follows: CPU A4-5000 with a main frequency of 1.50 GHz, 8 GB of memory, and the Ubuntu 16.04 system.
The EuRoC data set uses an unmanned aerial vehicle carrying sensors such as vision, inertial navigation and laser radar as the data acquisition platform, and the positioning data tracked by the millimeter-level laser radar is used as the ground-truth motion of the vehicle. The MH_04_difficult image sequence was acquired by the drone moving 91.7 m at 0.93 m/s and an angular velocity of 0.24 rad/s in a dark indoor building, with the camera acquiring data at a frequency of 20 Hz.
FIG. 1 is a flow chart of an embodiment of the present invention. As shown in fig. 1, the method for designing a binocular vision odometer based on optical flow tracking and point line feature matching, provided by the invention, comprises the following steps:
step 1: reading a binocular image, and preprocessing the image;
reading binocular images from the MH_04_difficult image sequence, wherein the image preprocessing method comprises the following steps: firstly, converting the image into a gray image, and then performing enhancement processing on the image by using an adaptive histogram equalization algorithm.
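As an illustration of this preprocessing step, the following is a minimal sketch using OpenCV; the function name, clip limit and tile size are assumed example values, not parameters given by the patent.

```python
import cv2

def preprocess(img_bgr, clip_limit=3.0, tile=(8, 8)):
    """Convert to grayscale and enhance with CLAHE (adaptive histogram equalization)."""
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile)
    return clahe.apply(gray)
```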
Step 2: the method comprises the following steps of tracking point features of a previous frame image by using a KLT optical flow method, and establishing data association of the point features between the previous frame and a next frame by combining a bidirectional annular matching strategy, wherein the method comprises the following steps:
step 2.1: tracking a left eye image of a previous frame by using a KLT optical flow method and a left eye image of a current frame, and performing tracking matching of point characteristics;
optical flow tracking describes a method by which pixels on an image move between images over time, namely: and determining the position of the pixel point in the previous frame to be about to appear in the next frame. The process of tracking the characteristics of the matching points by the optical flow is the process of tracking the pixel points.
The optical flow treats an image as a function of time, I (t), and then a pixel at time t, located at (x, y), whose gray scale can be written as I (x, y, t). For the same spatial point, in order to determine its upcoming position in the next frame, the optical flow method makes a basic assumption: the gray scale invariant assumption, namely: the pixel gray values of the same spatial point are fixed and invariant in each image.
At time t, the pixel at image coordinates (x, y) moves to image coordinates (x + dx, y + dy) at time t + dt; according to the gray-scale invariance assumption:
I(x+dx,y+dy,t+dt)=I(x,y,t) (1)
A Taylor expansion is performed on the left side of the above equation, retaining only the first-order terms:
I(x+dx, y+dy, t+dt) ≈ I(x, y, t) + (∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt (2)
According to the gray-scale invariance assumption, the gray value of the same spatial point at time t + dt equals its gray value at time t, so that:
(∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt = 0 (3)
Dividing both sides of the above formula by dt gives:
(∂I/∂x)(dx/dt) + (∂I/∂y)(dy/dt) = -∂I/∂t (4)
where dx/dt is the motion speed of the pixel in the x-axis direction, denoted u, and dy/dt is the motion speed of the pixel in the y-axis direction, denoted v. At the same time, ∂I/∂x is the gradient of the image at that pixel in the x-axis direction, denoted I_x, and ∂I/∂y is the gradient in the y-axis direction, denoted I_y. Further denoting the variation of the image gray value with time by I_t, the matrix form is:
[I_x I_y][u v]^T = -I_t (5)
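To make the matrix form (5) concrete, the following is a minimal numpy sketch that solves [I_x I_y][u v]^T = -I_t in the least-squares sense over a small window around one pixel; the window size, the simple gradient approximation and the function name are assumptions for the example, not the patented implementation.

```python
import numpy as np

def lk_flow_at(I_prev, I_next, x, y, win=10):
    """Minimal Lucas-Kanade step: solve [Ix Iy][u v]^T = -It in least squares
    over a (2*win+1)^2 window centred on integer pixel (x, y), well inside the
    image. Images are float32 grayscale arrays of the same shape."""
    # image gradients (central differences) and temporal difference
    Ix = np.gradient(I_prev, axis=1)
    Iy = np.gradient(I_prev, axis=0)
    It = I_next - I_prev

    ys, xs = np.mgrid[y - win:y + win + 1, x - win:x + win + 1]
    A = np.stack([Ix[ys, xs].ravel(), Iy[ys, xs].ravel()], axis=1)  # N x 2
    b = -It[ys, xs].ravel()                                         # N

    # least-squares solution of the over-determined system A [u v]^T = b
    uv, *_ = np.linalg.lstsq(A, b, rcond=None)
    return uv  # (u, v): pixel motion per frame
```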
after the motion speed u, v of the pixel point between the images is calculated, the position of the pixel point in the next frame image at the next moment can be estimated.
Fig. 2 illustrates the bidirectional ring matching strategy. As shown in FIG. 2, first, the feature point set of the previous-frame left eye image is tracked and matched into the current-frame left eye image by the bidirectional KLT optical flow method, giving the feature point set of the current-frame left eye image; the feature point set of the current-frame left eye image is then tracked and matched into the current-frame right eye image by the bidirectional KLT optical flow method, giving a temporary feature point set X1; the temporary feature point set X1 of the current-frame right eye image is tracked and matched into the previous-frame right eye image by the bidirectional KLT optical flow method, giving a temporary feature point set X2; it is then judged whether each feature point in X2 falls within the set neighborhood of the original feature points of the previous-frame right eye image; the feature points falling outside this neighborhood are deleted from X2, and the corresponding feature points in X1 are deleted according to the matching relationship between X1 and X2, thereby obtaining the feature point set of the current-frame right eye image.
The bi-directional KLT optical flow method refers to: assuming that point feature matching is performed on the image Picture1 and the image Picture2, point feature tracking is performed from the image Picture1 to the image Picture2 by using the KLT optical flow method, points that fail tracking and points that are tracked to the edge of the image are removed, then the remaining points are again tracked in reverse direction from the image Picture2 to the image Picture1 by using the KLT optical flow method, and the points that fail tracking and the points that are tracked to the edge of the image are removed again, and then the remaining points are used as points matched between the image Picture1 and the image Picture 2.
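A minimal OpenCV sketch of such bidirectional (forward-backward) KLT tracking follows; the 1-pixel consistency threshold, the border margin and the function name are assumptions for the example, not values fixed by the patent.

```python
import cv2
import numpy as np

def bidirectional_klt(img1, img2, pts1, err_thresh=1.0, border=1):
    """Track pts1 (N x 2) from img1 to img2 and back; keep only consistent points."""
    pts1 = pts1.astype(np.float32).reshape(-1, 1, 2)
    # forward tracking img1 -> img2
    pts2, st_fwd, _ = cv2.calcOpticalFlowPyrLK(img1, img2, pts1, None)
    # backward tracking img2 -> img1
    pts1_back, st_bwd, _ = cv2.calcOpticalFlowPyrLK(img2, img1, pts2, None)

    # forward-backward consistency: the round trip must return near the start
    fb_err = np.linalg.norm(pts1 - pts1_back, axis=2).ravel()
    h, w = img2.shape[:2]
    inside = ((pts2[:, 0, 0] > border) & (pts2[:, 0, 0] < w - border) &
              (pts2[:, 0, 1] > border) & (pts2[:, 0, 1] < h - border))
    good = (st_fwd.ravel() == 1) & (st_bwd.ravel() == 1) & (fb_err < err_thresh) & inside

    return pts1[good].reshape(-1, 2), pts2[good].reshape(-1, 2), good
```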
Step 2.2: point features were screened and supplemented.
In step 2.1, the optical flow tracking method is used to track the point features of the previous-frame left eye image into the current-frame left eye image. The number of successfully tracked point features is then checked; new point features are extracted from the current-frame left eye image with the Shi-Tomasi corner detection method, the new points that fall within the set neighborhood of existing point features are deleted, and the remaining newly added point features are used to supplement the point features of the current-frame left eye image up to 200.
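A minimal sketch of this replenishing step follows, assuming Shi-Tomasi detection via cv2.goodFeaturesToTrack and a circular exclusion mask around already-tracked points; the quality level and minimum distance are example values, not the patent's parameters.

```python
import cv2
import numpy as np

def replenish_features(img, tracked_pts, target=200, min_dist=15):
    """Top up the tracked feature set (array of shape (N, 2)) to `target` points
    with Shi-Tomasi corners, masking out a neighbourhood around existing points."""
    need = target - len(tracked_pts)
    if need <= 0:
        return tracked_pts
    mask = np.full(img.shape[:2], 255, dtype=np.uint8)
    for x, y in tracked_pts:
        cv2.circle(mask, (int(x), int(y)), min_dist, 0, -1)  # forbid re-detection here
    new = cv2.goodFeaturesToTrack(img, maxCorners=need, qualityLevel=0.01,
                                  minDistance=min_dist, mask=mask)
    if new is None:
        return tracked_pts
    return np.vstack([tracked_pts, new.reshape(-1, 2)])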
Step 2.3: and (3) performing point feature tracking matching between the left eye image and the right eye image of the current frame by using a KLT optical flow method, and further calculating a three-dimensional coordinate corresponding to the matching point according to an example stereoscopic vision algorithm. And calculating the three-dimensional coordinates corresponding to the matching points according to a stereoscopic vision algorithm, wherein the calculation process is as follows:
d = u_L - u_R (6)
Z = f·b/d,  X = (u_L - c_x)·Z/f,  Y = (v_L - c_y)·Z/f (7)
where (X, Y, Z) are the three-dimensional coordinates of the spatial point in the camera coordinate system, f is the focal length of the camera, b is the baseline of the binocular camera, c_x, c_y are camera intrinsic parameters, u_L is the x-axis pixel coordinate of the spatial point in the left eye image, u_R is the x-axis pixel coordinate of the spatial point in the right eye image, v_L is the y-axis pixel coordinate of the spatial point in the left eye image, and d is the difference of the abscissas of the spatial point in the left and right images, i.e. the disparity.
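A minimal sketch of this triangulation, following equations (6) and (7), is given below; the minimum-disparity guard is an assumption added for the example.

```python
import numpy as np

def triangulate_stereo(uL, vL, uR, f, b, cx, cy, min_disp=1e-6):
    """Recover (X, Y, Z) in the left-camera frame from a rectified stereo match,
    following d = uL - uR and Z = f*b/d (equations (6)-(7))."""
    d = uL - uR                      # disparity
    if d <= min_disp:
        return None                  # point too far away or unreliable match
    Z = f * b / d
    X = (uL - cx) * Z / f
    Y = (vL - cy) * Z / f
    return np.array([X, Y, Z])
```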
Step 3: extracting line features of the current frame, comprising the following steps:
step 3.1: extracting line features based on an LSD line feature extraction algorithm, and calculating a line feature descriptor based on an LBD line feature description algorithm;
step 3.2: performing disconnection merging operation on the extracted line characteristics;
FIG. 3 is the line feature merging strategy. As shown in fig. 3, line feature merging mainly includes the following sub-steps:
Step 3.2.1: the line feature set L = {L_i}, i = 1, ..., n, is sorted by length and the longest line segment L_max is selected;
Step 3.2.2: the main-direction difference dist(ang_i, ang_j), the point-to-line distance d, the endpoint distance l and the descriptor distance between L_max and each member of L are computed; according to the set thresholds, the line segments in L that are similar to L_max are found and form a candidate line segment group L';
Step 3.2.3: each member of L' is merged with L_max and the descriptors are recomputed, giving the merged line segment group L'';
Step 3.2.4: the length s of each L''_i and the main-direction difference between L''_i and L_max are computed and checked against the set thresholds; the candidate line segment L'_i producing the smallest main-direction difference with L_max is found, and L'_i and L_max are replaced by the corresponding merged line segment L''_i;
where des = {des_i}, i = 1, ..., n, is the descriptor set corresponding to the line feature set L = {L_i}, i = 1, ..., n, ang_TH is the main-direction difference threshold, d_TH the point-line distance threshold, l_TH the endpoint distance threshold, des_TH the descriptor distance threshold, and s_TH the line segment length threshold.
Step 3.3: and performing line feature matching between the left eye image and the right eye image of the current frame.
Step 4: Line feature matching is performed between the previous image frame and the current image frame.
Two successfully matched line segments l_1, l_2 satisfy the following conditions:
(1) the line segments detected by LSD are directed, and the angle between the direction vectors of the two matched line segments is less than Φ;
(4) the distance between the corresponding LBD feature vectors is less than ρ_T and is the smallest among all candidate matches.
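The following is a schematic sketch of a matcher applying the two stated conditions; the threshold values, the greedy search and the use of the Euclidean norm between LBD descriptors are assumptions for the example, not the patent's implementation.

```python
import numpy as np

def match_lines(angles1, desc1, angles2, desc2, phi=np.deg2rad(10), rho_T=0.5):
    """Greedy line matching: direction angle difference below phi and the smallest
    LBD descriptor distance below rho_T. Inputs are per-segment angle lists and
    descriptor arrays for the two images."""
    matches = []
    for i, (a1, d1) in enumerate(zip(angles1, desc1)):
        best_j, best_dist = -1, rho_T
        for j, (a2, d2) in enumerate(zip(angles2, desc2)):
            dang = abs(a1 - a2) % np.pi
            if min(dang, np.pi - dang) >= phi:       # condition (1): direction gate
                continue
            dist = np.linalg.norm(d1 - d2)
            if dist < best_dist:                     # condition (4): smallest distance
                best_j, best_dist = j, dist
        if best_j >= 0:
            matches.append((i, best_j))
    return matches
```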
Step 5: Based on the data association result of the point features between the previous frame and the current frame, a PnP pose estimation method is adopted to obtain the initial pose estimate of the current frame.
Step 6: The pose of the current frame is optimized and adjusted using a cost function constructed from the reprojection errors of the point features and line features between the previous frame and the current frame.
the constraint relation between the point characteristics and the pose is as follows:
P_c = [P_x P_y P_z]^T = R_cw·P_w + t_cw (8)
p_{i,j} = Φ(P_c) = [f_x·P_x/P_z + c_x, f_y·P_y/P_z + c_y]^T (9)
where Φ represents the projection model of the pinhole camera, f_x, f_y are its focal length parameters and c_x, c_y its principal point.
The reprojection error of the spatial point features is defined as follows:
e^p_{i,j} = [x_{i,j}, y_{i,j}]^T - Φ(R_iw·P_{w,j} + t_iw) (10)
In the above formula, (x_{i,j}, y_{i,j}) are the pixel coordinates of the j-th point feature observed on the i-th frame image.
FIG. 4 is a schematic diagram of the reprojection error of a line feature, where s_{w,k}, e_{w,k} are the two 3D endpoints, in the world coordinate system, of the k-th line feature observed by the i-th frame image; s'_{i,k}, e'_{i,k} are the two 2D endpoints obtained by reprojecting s_{w,k}, e_{w,k} onto the i-th frame image; s_{i,k}, e_{i,k} are the two corresponding endpoints detected in the i-th frame image; hs_{i,k}, he_{i,k} are the homogeneous coordinate points corresponding to s_{i,k}, e_{i,k}; and d_s, d_e denote the distances from the two endpoints to the reprojected line segment.
π is the plane determined by the two homogeneous coordinate points hs_{i,k}, he_{i,k} and the camera optical center c_i, and n_{i,k} denotes the unit normal vector of the plane π:
n_{i,k} = (hs_{i,k} × he_{i,k}) / ||hs_{i,k} × he_{i,k}|| (11)
The reprojection error of the spatial line features is defined from the two endpoint-to-line distances:
e^l_{i,k} = [d_s, d_e]^T (12)
Under the assumption that the observation errors obey a Gaussian distribution, a cost function F integrating the reprojection errors of the point and line features can be constructed:
F = ∑_{j∈P} H_P((e^p_{i,j})^T·Ω_p·e^p_{i,j}) + ∑_{k∈L} H_l((e^l_{i,k})^T·Ω_l·e^l_{i,k}) (13)
where Ω_p, Ω_l are the information matrices of the reprojection errors of the point features and the line features respectively, H_P, H_l are the Huber robust kernel functions of the point and line features respectively, and P, L denote the point feature set and the line feature set respectively.
Since what is minimized is a sum of squared two-norms of error terms, the cost grows with the square of the error; if a mismatch occurs, the system would be optimized towards a wrong value, so the Huber kernel function is introduced to reduce the influence of mismatches.
The specific form of the Huber kernel function is as follows:
H(s) = { s, s ≤ δ² ; 2δ·√s - δ², s > δ² } (14)
where s is the squared reprojection error term of a point or line feature and δ is the threshold; when the error exceeds the threshold, the growth of the cost function changes from quadratic to linear in the error, thereby reducing the influence of mismatches.
The process of minimizing the cost function F is the process of solving the state quantities to be estimated. In order to optimize the cost function by using the nonlinear optimization method, a Jacobian matrix of the error function with respect to the state variable needs to be calculated first.
When the camera undergoes a small pose change ξ ∈ se(3) (ξ being the Lie algebra element corresponding to the transformation matrix T_iw), denote by g_P the coordinates of P_{w,j} in the camera coordinate system, and by g_s, g_e the coordinates of s_{w,k}, e_{w,k} in the camera coordinate system:
g_P = exp(ξ^)·P_{w,j} = R_iw·P_{w,j} + t_iw (15)
g_s = exp(ξ^)·s_{w,k} = R_iw·s_{w,k} + t_iw (16)
g_e = exp(ξ^)·e_{w,k} = R_iw·e_{w,k} + t_iw (17)
The Jacobian matrix of the point feature reprojection error e^p_{i,j} with respect to the small camera pose change ξ ∈ se(3) is obtained by the chain rule as:
∂e^p_{i,j}/∂ξ = -(∂Φ/∂g_P)·(∂g_P/∂ξ) = -[ f_x/Z  0  -f_x·X/Z² ; 0  f_y/Z  -f_y·Y/Z² ]·[ I_{3×3}  -g_P^∧ ] (18)
where (X, Y, Z) = g_P and the perturbation ξ is written with its translation part first.
The Jacobian matrix of the line feature reprojection error e^l_{i,k} with respect to the small camera pose change ξ ∈ se(3) is obtained in the same way by the chain rule, through the derivatives of d_s, d_e with respect to the reprojected endpoints and the derivatives of g_s, g_e with respect to ξ.
In the above, [·]^∧ denotes the conversion of a vector into an antisymmetric matrix, by the following formula: for a = [a_1 a_2 a_3]^T,
a^∧ = [ 0  -a_3  a_2 ; a_3  0  -a_1 ; -a_2  a_1  0 ]
After the Jacobian matrices are calculated, the optimization problem is solved by the Gauss-Newton nonlinear optimization method.
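As a compact illustration, the sketch below performs Gauss-Newton iterations for the point-feature terms only, using the Jacobian of equation (18); the first-order pose update and the SVD re-orthonormalisation are simplifications assumed for the example, not the patent's exact implementation.

```python
import numpy as np

def skew(a):
    """Antisymmetric matrix a^ of a 3-vector a."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def project(g, fx, fy, cx, cy):
    X, Y, Z = g
    return np.array([fx * X / Z + cx, fy * Y / Z + cy])

def gauss_newton_pose(R, t, Pw, obs, fx, fy, cx, cy, iters=10):
    """Point-feature-only Gauss-Newton refinement of (R, t).
    Pw: list of world points, obs: list of observed pixel coordinates."""
    for _ in range(iters):
        H = np.zeros((6, 6)); b = np.zeros(6)
        for P, z in zip(Pw, obs):
            g = R @ P + t
            X, Y, Z = g
            e = z - project(g, fx, fy, cx, cy)
            dpi = np.array([[fx / Z, 0.0, -fx * X / Z**2],
                            [0.0, fy / Z, -fy * Y / Z**2]])
            J = -dpi @ np.hstack([np.eye(3), -skew(g)])   # 2 x 6, as in equation (18)
            H += J.T @ J
            b += -J.T @ e
        dx = np.linalg.solve(H, b)                        # normal equations
        dt, dphi = dx[:3], dx[3:]
        R = (np.eye(3) + skew(dphi)) @ R                  # first-order rotation update
        t = (np.eye(3) + skew(dphi)) @ t + dt
        U, _, Vt = np.linalg.svd(R)                       # re-orthonormalise onto SO(3)
        R = U @ Vt
    return R, t
```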
Step 7: It is judged whether the current frame is a key frame; if so, step 8 is executed; if not, the previous key frame is set as the reference key frame of the current frame, and the procedure jumps to step 12.
Key frame selection refers to removing redundant image frames and retaining representative ones. If too many key frames are selected, the information between them is redundant, which brings a large amount of computation and makes the real-time requirement difficult to meet; if too few are selected, the information correlation between key frames is poor and the camera pose is difficult to estimate correctly, leading to positioning failure. A key frame selected by the invention simultaneously satisfies the following conditions:
(a) at least 20 image frames lie between it and the previous key frame;
(b) at least 50 point features and 15 line features are successfully tracked;
(c) fewer than 75% of its point and line features are co-visible with the previous key frame.
According to the key frame judgment conditions, if the current frame is not a key frame, the previous key frame is set as the reference key frame of the current frame, and the relative pose relationship between the two frames is calculated and recorded.
Let the pose of the current frame CurrentImage be T_C ∈ SE(3) and the pose of the reference key frame be T_R ∈ SE(3); the relative pose is then T_Rel = T_C·(T_R)^{-1}.
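A minimal sketch of the key frame test and of the relative pose bookkeeping follows; the argument names and the 4x4 homogeneous-matrix representation of poses are assumptions for the example.

```python
import numpy as np

def is_keyframe(frames_since_kf, n_points, n_lines, covisible_ratio):
    """Key frame test following the three conditions of step 7; the arguments are
    placeholders for quantities maintained by the tracker."""
    return (frames_since_kf >= 20 and
            n_points >= 50 and n_lines >= 15 and
            covisible_ratio < 0.75)

def relative_pose(T_C, T_R):
    """Relative pose of the current frame w.r.t. its reference key frame,
    T_Rel = T_C * inv(T_R), with 4x4 homogeneous transforms."""
    return T_C @ np.linalg.inv(T_R)
```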
Step 8: extracting the point features of the current frame, comprising the following steps:
step 8.1: extracting and describing point features of the current frame by adopting an ORB algorithm;
and establishing a pyramid for the current frame image, respectively carrying out image blocking on the pyramid image to obtain an image area with a certain size, and extracting and describing point characteristics in each block by utilizing an ORB characteristic point extraction algorithm.
Step 8.2: and carrying out sparse stereo matching on the point characteristics between the left and right eye images of the current frame in a descriptor matching mode.
Point features are matched between the left eye image and the right eye image according to the distance between their descriptors, and the 3D coordinates of the corresponding spatial points are solved from the feature point pairs obtained by the binocular matching.
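A minimal OpenCV sketch of this descriptor-based sparse stereo matching follows; the feature count, row tolerance and cross-check matching are example choices, not parameters fixed by the patent.

```python
import cv2

def orb_stereo_match(img_left, img_right, n_features=1000):
    """Sparse stereo matching of ORB features by descriptor distance."""
    orb = cv2.ORB_create(nfeatures=n_features)
    kpL, desL = orb.detectAndCompute(img_left, None)
    kpR, desR = orb.detectAndCompute(img_right, None)
    if desL is None or desR is None:
        return []

    # Hamming distance for binary ORB descriptors; cross-check keeps mutual best matches
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = bf.match(desL, desR)

    pairs = []
    for m in matches:
        uL, vL = kpL[m.queryIdx].pt
        uR, vR = kpR[m.trainIdx].pt
        # on rectified images the match must lie on (almost) the same row,
        # and the disparity uL - uR must be positive
        if abs(vL - vR) < 2.0 and uL - uR > 0:
            pairs.append((uL, vL, uR))
    return pairs   # feed each (uL, vL, uR) to the triangulation of step 2.3
```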
Step 9: carrying out data association of point and line features among local key frames.
the local key frame herein refers to 11 key frames composed of the current frame and the previous 10 key frames. The point features between local key frames are data-related by means of descriptor matching, and it should be particularly noted that the feature points between the previous and next frames in step 2 are data-related by tracking the point features by using the KLT optical flow method.
Step 10: performing secondary optimization adjustment on the pose of the local key frames by using a cost function constructed by the re-projection errors of the point features and the line features between the local key frames;
F = ∑_{i∈K} [ ∑_{j∈P} H_P((e^p_{i,j})^T·Ω_p·e^p_{i,j}) + ∑_{k∈L} H_l((e^l_{i,k})^T·Ω_l·e^l_{i,k}) ]
where Ω_p, Ω_l are the information matrices of the reprojection errors of the point features and the line features respectively, H_P, H_l are the Huber robust kernel functions of the point and line features respectively, and P, L, K denote the point feature set, the line feature set and the local key frame set respectively.
In step 6, a cost function is constructed according to the data association relation of the point and line characteristics between the previous frame and the next frame, so that the pose of the current frame is optimized and solved; in step 10, a new cost function is constructed again according to the data association relationship of the point and line characteristics between the local key frames, and the pose of the local key frames is optimized and adjusted by adopting a Gauss-Newton nonlinear optimization method.
Step 11: adjusting the pose of the non-key frame;
for each non-key frame, a corresponding reference key frame is set for the non-key frame, and the relative pose quantity between the non-key frame and the reference key frame is solved. In step 10, the pose of the local key frame is adjusted, so that the pose T of the local key frame RefermImage is referred toRIs optimally adjusted to TRAt' time, the poses of those non-key frames are also adjusted according to the Relative pose quantity Relative, that is: t isCIs adjusted toIs TC', wherein, TC′=TRelTR'. By setting the reference key frame, the pose of the non-key frame is optimized and adjusted again.
Step 12: and outputting the pose of the image frame.
Steps 1 to 12 are applied in a loop to the MH_04_difficult image sequence, the pose corresponding to each image frame is calculated, and the pose results are output and recorded in a txt document.
And then, evaluating and analyzing the performance of the algorithm according to the pose result of the image frame recorded in the txt document.
Fig. 5 shows a comparison of the experimental result of the present algorithm on the MH_04_difficult image sequence with the real trajectory; the experimental result shows that the algorithm can complete the positioning task on the MH_04_difficult image sequence.
In the positioning accuracy analysis, the Absolute Pose Error (APE) is adopted as the evaluation index for the positioning accuracy of the algorithm. The APE is obtained from the difference between the estimated pose and the real pose: with the real pose at time i denoted T_i ∈ SE(3) and the estimated pose denoted T_i' ∈ SE(3), the APE is APE_i = T_i·(T_i')^{-1}.
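A minimal sketch of this APE computation with 4x4 homogeneous poses follows; reporting the norm of the translational part as the positioning error in metres is an assumption consistent with the results quoted below.

```python
import numpy as np

def absolute_pose_error(T_true, T_est):
    """APE_i = T_i * inv(T_i') for 4x4 poses; returns the error transform and
    the norm of its translational part (positioning error in metres)."""
    E = T_true @ np.linalg.inv(T_est)
    return E, np.linalg.norm(E[:3, 3])
```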
The positioning accuracy of the algorithm is compared with that of the ORB-SLAM2 algorithm with its loop detection module removed, on the EuRoC MH_04_difficult image sequence.
FIG. 6-1 is a comparison graph of the APE of the present algorithm and the APE of the ORB-SLAM2 algorithm without loop detection; for a better comparison of positioning accuracy, the accuracy figures are summarized in FIG. 6-2.
As can be seen from FIG. 6-2, on the MH_04_difficult image sequence the maximum positioning error of the algorithm is 0.268466 m, a reduction of 0.046196 m compared with the ORB-SLAM2 algorithm; the minimum positioning error is 0.011937 m, a reduction of 0.000562 m compared with the ORB-SLAM2 algorithm; and the root mean square error is 0.096488 m, an improvement in positioning accuracy of 28.8% over the ORB-SLAM2 algorithm.
The average processing time of the algorithm for each image frame is 42.837 ms, about 23.3 frames per second; compared with the camera's data acquisition frequency of 20 Hz, the algorithm meets the real-time requirement.
Claims (4)
1. A binocular vision odometer design method based on optical flow tracking and point line feature matching is characterized by comprising the following steps:
step 1: acquiring an image by using a binocular camera, converting the image into a gray image, and then performing enhancement processing on the image by using a self-adaptive histogram equalization algorithm;
step 2: using a KLT optical flow method to track point features of a previous frame image, and establishing data association of the point features between the previous frame and the next frame by combining a bidirectional annular matching strategy:
step 2.1: tracking a left eye image of a previous frame by using a KLT optical flow method and a left eye image of a current frame, and performing tracking matching of point characteristics;
step 2.2: screening and supplementing point characteristics;
step 2.3: performing point feature tracking matching between the left eye image and the right eye image of the current frame by using a KLT optical flow method, and further calculating the three-dimensional coordinates corresponding to the matched points according to a binocular stereo vision algorithm;
step 3: extracting line features of the current frame:
step 3.1: extracting line features based on an LSD line feature extraction algorithm, and calculating a line feature descriptor based on an LBD line feature description algorithm;
step 3.2: performing disconnection merging operation on the extracted line characteristics;
step 3.3: performing line feature matching between the left eye image and the right eye image of the current frame;
step 4: matching line features between the previous frame and the current frame;
step 5: based on the data association result of the point features between the previous frame and the current frame, a PnP pose estimation method is adopted to obtain the initial pose estimate of the current frame;
step 6: optimizing and adjusting the pose of the current frame by using a cost function constructed by the reprojection errors of the point characteristics and the line characteristics between the previous frame and the next frame;
step 7: judging whether the current frame is a key frame, if so, executing step 8; if not, setting the previous key frame as the reference key frame of the current frame, and then jumping to step 12;
step 8: extracting the point characteristics of the current frame:
step 8.1: extracting and describing point features of the current frame by adopting an ORB algorithm;
step 8.2: performing sparse stereo matching on the point characteristics between the left and right eye images of the current frame in a descriptor matching mode;
step 9: carrying out data association of point and line characteristics among local key frames;
step 10: performing secondary optimization adjustment on the pose of the local key frames by using a cost function constructed by the re-projection errors of the point features and the line features between the local key frames;
step 11: adjusting the pose of the non-key frame;
step 12: and outputting the poses of all the image frames.
2. The binocular vision odometer design method based on optical flow tracking and point line feature matching as claimed in claim 1, wherein the bidirectional ring matching strategy in step 2 refers to the following: the feature point set of the previous-frame left eye image is tracked and matched into the current-frame left eye image by the bidirectional KLT optical flow method, giving the feature point set of the current-frame left eye image; the feature point set of the current-frame left eye image is then tracked and matched into the current-frame right eye image by the bidirectional KLT optical flow method, giving a temporary feature point set X1; the temporary feature point set X1 of the current-frame right eye image is tracked and matched into the previous-frame right eye image by the bidirectional KLT optical flow method, giving a temporary feature point set X2; it is then judged whether each feature point in X2 falls within the set neighborhood of the original feature points of the previous-frame right eye image; the feature points falling outside this neighborhood are deleted from X2, and the corresponding feature points in X1 are deleted according to the matching relationship between X1 and X2, thereby obtaining the feature point set of the current-frame right eye image.
3. The binocular vision odometer design method based on optical flow tracking and point line feature matching as claimed in claim 1, wherein in step 4 two successfully matched line segments l_1, l_2 satisfy the following conditions:
(1) the line segments detected by LSD are directed, and the angle between the direction vectors of the two matched line segments is less than Φ;
(4) the distance between the corresponding LBD feature vectors is less than ρ_T and is the smallest among all candidate matches.
4. The binocular vision odometer design method based on optical flow tracking and point line feature matching as claimed in claim 1, wherein in step 7 a key frame is determined according to the following principle: all of the following conditions are satisfied:
(1) at least 20 image frames lie between the current frame and the previous key frame;
(2) at least 50 point features and 15 line features are successfully tracked;
(3) fewer than 75% of the point and line features are co-visible with the previous key frame.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010862610.8A | 2020-08-25 | 2020-08-25 | Binocular vision odometer design method based on optical flow tracking and point line feature matching |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN112115980A (en) | 2020-12-22 |
Family
ID=73805591
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010862610.8A (CN112115980A, Pending) | Binocular vision odometer design method based on optical flow tracking and point line feature matching | 2020-08-25 | 2020-08-25 |
Country Status (1)
| Country | Link |
|---|---|
| CN | CN112115980A (en) |
Patent Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106909877A (en) * | 2016-12-13 | 2017-06-30 | 浙江大学 | A kind of vision based on dotted line comprehensive characteristics builds figure and localization method simultaneously |
US20180297207A1 (en) * | 2017-04-14 | 2018-10-18 | TwoAntz, Inc. | Visual positioning and navigation device and method thereof |
CN109558879A (en) * | 2017-09-22 | 2019-04-02 | 华为技术有限公司 | A kind of vision SLAM method and apparatus based on dotted line feature |
CN107869989A (en) * | 2017-11-06 | 2018-04-03 | 东北大学 | A kind of localization method and system of the fusion of view-based access control model inertial navigation information |
CN108447097A (en) * | 2018-03-05 | 2018-08-24 | 清华-伯克利深圳学院筹备办公室 | Depth camera scaling method, device, electronic equipment and storage medium |
CN108537876A (en) * | 2018-03-05 | 2018-09-14 | 清华-伯克利深圳学院筹备办公室 | Three-dimensional rebuilding method, device, equipment based on depth camera and storage medium |
CN108519102A (en) * | 2018-03-26 | 2018-09-11 | 东南大学 | A kind of binocular vision speedometer calculation method based on reprojection |
CN108986037A (en) * | 2018-05-25 | 2018-12-11 | 重庆大学 | Monocular vision odometer localization method and positioning system based on semi-direct method |
CN109579840A (en) * | 2018-10-25 | 2019-04-05 | 中国科学院上海微系统与信息技术研究所 | A kind of close coupling binocular vision inertia SLAM method of dotted line Fusion Features |
CN109934862A (en) * | 2019-02-22 | 2019-06-25 | 上海大学 | A kind of binocular vision SLAM method that dotted line feature combines |
CN110132302A (en) * | 2019-05-20 | 2019-08-16 | 中国科学院自动化研究所 | Merge binocular vision speedometer localization method, the system of IMU information |
CN110570453A (en) * | 2019-07-10 | 2019-12-13 | 哈尔滨工程大学 | Visual odometer method based on binocular vision and closed-loop tracking characteristics |
CN110490085A (en) * | 2019-07-24 | 2019-11-22 | 西北工业大学 | The quick pose algorithm for estimating of dotted line characteristic visual SLAM system |
CN111445526A (en) * | 2020-04-22 | 2020-07-24 | 清华大学 | Estimation method and estimation device for pose between image frames and storage medium |
Non-Patent Citations (6)
Title |
---|
QI NAIXIN ET AL: "An ORB Corner Tracking Method Based on KLT", 《INTERNATIONAL CONFERENCE ON MECHATRONICS AND INTELLIGENT ROBOTICS》 * |
SANG JUN LEE ET AL: "Elaborate Monocular Point and Line SLAM with Robust Initialization", 《2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION》 * |
ZHANHAI YU ET AL: "The inter-frame feature matching and tracking of binocular vision based on the ORB-PyrLK algorithm", 《2017 36TH CHINESE CONTROL CONFERENCE》 * |
He Hanwu et al.: "Augmented Reality Interaction Methods and Implementation", 31 December 2018 *
Wu Di: "Research on Positioning Technology of Underground LHD Based on Stereo Visual Odometry", China Doctoral Dissertations Full-text Database, Engineering Science and Technology I *
Yang Dongdong et al.: "Binocular Visual Odometry Algorithm Based on Local and Global Optimization", Computer Engineering *
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112880687A (en) * | 2021-01-21 | 2021-06-01 | 深圳市普渡科技有限公司 | Indoor positioning method, device, equipment and computer readable storage medium |
CN112880687B (en) * | 2021-01-21 | 2024-05-17 | 深圳市普渡科技有限公司 | Indoor positioning method, device, equipment and computer readable storage medium |
CN112802196A (en) * | 2021-02-01 | 2021-05-14 | 北京理工大学 | Binocular inertia simultaneous positioning and map construction method based on dotted line feature fusion |
CN112802196B (en) * | 2021-02-01 | 2022-10-21 | 北京理工大学 | Binocular inertia simultaneous positioning and map construction method based on dotted line feature fusion |
WO2022174603A1 (en) * | 2021-02-21 | 2022-08-25 | 深圳市优必选科技股份有限公司 | Pose prediction method, pose prediction apparatus, and robot |
WO2022179047A1 (en) * | 2021-02-26 | 2022-09-01 | 魔门塔(苏州)科技有限公司 | State information estimation method and apparatus |
CN113012196A (en) * | 2021-03-05 | 2021-06-22 | 华南理工大学 | Positioning method based on information fusion of binocular camera and inertial navigation sensor |
CN112991388A (en) * | 2021-03-11 | 2021-06-18 | 中国科学院自动化研究所 | Line segment feature tracking method based on optical flow tracking prediction and convex geometric distance |
CN113376669A (en) * | 2021-06-22 | 2021-09-10 | 东南大学 | Monocular VIO-GNSS fusion positioning algorithm based on dotted line characteristics |
CN113465617A (en) * | 2021-07-08 | 2021-10-01 | 上海汽车集团股份有限公司 | Map construction method and device and electronic equipment |
CN113465617B (en) * | 2021-07-08 | 2024-03-19 | 上海汽车集团股份有限公司 | Map construction method and device and electronic equipment |
CN113888603A (en) * | 2021-09-16 | 2022-01-04 | 西北工业大学 | Loop detection and visual SLAM method based on optical flow tracking and feature matching |
CN114399532A (en) * | 2022-01-06 | 2022-04-26 | 广东汇天航空航天科技有限公司 | Camera position and posture determining method and device |
CN116610129A (en) * | 2023-07-17 | 2023-08-18 | 山东优宝特智能机器人有限公司 | Local path planning method and system for leg-foot robot |
CN116610129B (en) * | 2023-07-17 | 2023-09-29 | 山东优宝特智能机器人有限公司 | Local path planning method and system for leg-foot robot |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20201222 |