
CN117808839A - Image frame alignment method and device and electronic equipment - Google Patents

Image frame alignment method and device and electronic equipment

Info

Publication number
CN117808839A
CN117808839A (application CN202311575413.8A)
Authority
CN
China
Prior art keywords
frame image
calculating
depth
image
grid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311575413.8A
Other languages
Chinese (zh)
Inventor
刘永劼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aixin Yuanzhi Semiconductor Ningbo Co ltd
Original Assignee
Aixin Yuanzhi Semiconductor Ningbo Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aixin Yuanzhi Semiconductor Ningbo Co ltd filed Critical Aixin Yuanzhi Semiconductor Ningbo Co ltd
Priority to CN202311575413.8A priority Critical patent/CN117808839A/en
Publication of CN117808839A publication Critical patent/CN117808839A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an image frame alignment method, an image frame alignment device and electronic equipment, wherein the method comprises the following steps: acquiring matching point pairs of a current frame image and a previous frame image; calculating rotation parameters and translation parameters between the current frame image and the previous frame image through an inertial sensor, and calculating the point pair depth of the matching point pairs according to the matching point pairs, the rotation parameters and the translation parameters; calculating the vertex depths of the vertices in a grid according to the point pair depths, and calculating the grid offset values of the grid on the 2D image according to the vertex depths, the rotation parameters and the translation parameters; performing bilinear interpolation up-sampling on the offset values to obtain pixel-based target offset values between the current frame image and the previous frame image; and aligning the current frame image and the previous frame image according to the target offset values. Because the rotation and translation relationship between the two frames is calculated through the inertial sensor, which does not depend on scene illumination, alignment between image frames can be achieved even in low light, and the accuracy of the alignment result between image frames is improved.

Description

Image frame alignment method and device and electronic equipment
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to an image frame alignment method and apparatus, and an electronic device.
Background
The vehicle-mounted camera generally records video and audio during driving and can provide a reliable basis for the analysis and handling of traffic accidents. It has certain requirements on imaging definition; in a night scene, because the amount of light entering the camera is insufficient and imaging noise is high, noise reduction processing is required to obtain a clearer image. Noise reduction can be divided into 2D noise reduction and 3D noise reduction: 2D noise reduction operates only in the two-dimensional spatial domain, while 3D noise reduction adds temporal processing on top of 2D noise reduction and takes the temporal relationship between frames into account, so it is three-dimensional. In an extremely dark scene, 2D noise reduction alone can hardly achieve an ideal result, so a 3D noise reduction mode may be used to improve the effect. Since there may be motion in the photographed scene, a motion alignment operation between images is required when performing 3D noise reduction.
In one possible approach, the motion relationship between image frames may be obtained by calculating matching points between the frames. However, in extremely dark scenes, the accuracy of algorithms that work directly on the images is difficult to guarantee. In another possible approach, besides the vehicle-mounted camera, the vehicle-mounted system may also carry sensors such as an inertial measurement unit (IMU) or a lidar, and the motion relationship between image frames can be obtained with the IMU or the lidar. For example, the rotation angle between image frames can be obtained by integrating the angular velocity of the IMU between the frames, and the positional relationship between the camera and the IMU can be calibrated, so that the inter-frame rotation transformation of the image can be obtained through the IMU; a homography transformation (homography matrix, H) between the image frames can then be obtained from the rotation transformation, thereby aligning the image frames.
However, this processing does not take translation into account, and the vehicle inevitably translates while driving, so the calculated alignment result between the image frames is inaccurate. In summary, the alignment result between image frames is inaccurate both when it is calculated by an algorithm based directly on the images and when it is calculated by an IMU alone.
Disclosure of Invention
The application provides an image frame alignment method, an image frame alignment device and electronic equipment, and aims to solve the problem that the alignment result between image frames is inaccurate when it is calculated either by an algorithm based directly on the images or by an IMU alone.
In a first aspect, some embodiments of the present application provide an image frame alignment method, including:
acquiring a matching point pair of a current frame image and a previous frame image, wherein the matching point pair is a point pair corresponding to the same pixel position in the current frame image and the previous frame image;
calculating rotation parameters and translation parameters of the current frame image and the previous frame image through an inertial sensor, and calculating the point pair depth of the matching point pair according to the matching point pair, the rotation parameters and the translation parameters;
calculating the vertex depth of the vertices in a grid according to the point pair depth, wherein the grid is obtained by calculating the motion trajectories of pixels in the current frame image and the previous frame image;
calculating a grid offset value of the grid on a 2D image according to the vertex depth, the rotation parameter and the translation parameter;
performing interpolation calculation on the offset value to obtain a target offset value of the current frame image and the previous frame image based on pixels;
and aligning the current frame image and the previous frame image according to the target offset value.
In some possible embodiments, the step of calculating, by an inertial sensor, rotation parameters and translation parameters of the current frame image and the previous frame image comprises:
calculating the angular velocity of the inertial sensor according to the equipment parameters, the equipment offset variables and the Gaussian noise of the inertial sensor;
calculating linear acceleration of the inertial sensor according to the rotation variable, the equipment offset variable and the white noise of the inertial sensor;
acquiring a dynamics model of the inertial sensor through the angular velocity, and acquiring a discrete model of the dynamics model through Euler integration;
and inputting a formula for calculating the linear acceleration into the discrete model to obtain the rotation parameter and the translation parameter.
In some possible embodiments, the method further comprises:
calculating a pre-integral of the inertial sensor from the white noise, gaussian noise, the rotation parameter and the translation parameter;
calculating absolute coordinate values of the matching points based on the pre-integral and a GPS (global positioning system) locator;
and calculating an absolute rotation parameter and an absolute translation parameter according to the absolute coordinate values.
In some possible embodiments, the method further comprises:
comparing the pixel change information of the matching point pairs in the current frame image and the previous frame image to determine sparse points and motion information of pixels in the current frame image;
determining grid points in the current frame image according to the sparse points and the motion information;
and constructing grids of the current frame image according to the grid points.
In some possible embodiments, the step of calculating the point pair depth of the matching point pair from the matching point pair, the rotation parameter, and the translation parameter includes:
calculating a rotation residual error, a speed residual error and a translation residual error of the pose of the GPS positioner relative to the inertial sensor;
generating a cost function according to the rotation residual error, the speed residual error and the translation residual error;
calculating a minimum value of the cost function, and a target rotation parameter and a target translation parameter corresponding to the minimum value, by gradient descent;
and calculating the point pair depth of the matched point pair according to the target rotation parameter and the target translation parameter.
In some possible embodiments, the step of calculating the vertex depths of the vertices in the grid according to the point pair depth comprises:
identifying peripheral point pairs within a preset range of the matching point pairs;
determining vertexes in the grid according to the peripheral point pairs;
acquiring the number of the peripheral point pairs and the depth of the peripheral point pairs;
calculating the average depth of the vertexes according to the point pair quantity and the peripheral point pair depth;
the average depth is marked as the vertex depth of the vertex.
In some possible embodiments, the step of calculating a mesh offset value of the mesh on the 2D image from the vertex depth, the rotation parameter, and the translation parameter comprises:
acquiring pixel coordinates of the current frame image and the previous frame image;
calculating grid coordinates in the current frame image and the previous frame image according to the vertex depth, the pixel coordinates, the rotation parameters and the translation parameters;
and calculating grid offset values of the grids on the 2D image according to the grid coordinates.
In some possible embodiments, the step of performing interpolation on the offset value to obtain a target offset value of the current frame image and the previous frame image based on pixels includes:
acquiring the grid offset value;
and performing bilinear interpolation up-sampling on the offset value to obtain a target offset value based on the pixel.
In a second aspect, some embodiments of the present application provide an alignment apparatus for an image frame, including an acquisition module, a point-to-depth calculation module, a vertex depth calculation module, an offset value module, a linear interpolation module, and an alignment module;
the acquisition module is used for: acquiring a matching point pair of a current frame image and a previous frame image, wherein the matching point pair is a point pair corresponding to the same pixel position in the current frame image and the previous frame image;
the point-to-depth calculation module is used for: calculating rotation parameters and translation parameters of the current frame image and the previous frame image through an inertial sensor, and calculating the point pair depth of the matching point pair according to the matching point pair, the rotation parameters and the translation parameters;
the vertex depth calculation module is used for: calculating the vertex depth of the vertices in the grid according to the point pair depth, wherein the grid is obtained by calculating the motion trajectories of pixels in the current frame image and the previous frame image;
the offset value module is used for: calculating a grid offset value of the grid on a 2D image according to the vertex depth, the rotation parameter and the translation parameter;
the linear interpolation module is used for: performing interpolation calculation on the offset value to obtain a target offset value of the current frame image and the previous frame image based on pixels;
the alignment module is used for: and aligning the current frame image and the previous frame image according to the target offset value.
In a third aspect, some embodiments of the present application provide an electronic device comprising a processor and a memory communicatively coupled to the processor, wherein the memory stores instructions executable by the processor to cause the processor to perform the method of aligning image frames of the first aspect.
As can be seen from the above technical content, the present application provides an image frame alignment method, an image frame alignment device and an electronic device, where the method includes: acquiring matching point pairs of a current frame image and a previous frame image, wherein a matching point pair is a pair of points corresponding to the same pixel position in the two frames; calculating rotation parameters and translation parameters between the two frames through an inertial sensor, and calculating the point pair depth of the matching point pairs according to the matching point pairs, the rotation parameters and the translation parameters; calculating the vertex depths of the vertices in a grid according to the point pair depths, wherein the grid is obtained by calculating the motion trajectories of pixels in the two frames; calculating the grid offset values of the grid on the 2D image according to the vertex depths, the rotation parameters and the translation parameters; performing bilinear interpolation up-sampling on the offset values to obtain pixel-based target offset values between the two frames; and aligning the current frame image and the previous frame image according to the target offset values. Because the rotation and translation relationship between the two frames is calculated through the inertial sensor, which does not depend on scene illumination, alignment between image frames can be achieved even in low light, and the accuracy of the alignment result between image frames is improved.
Drawings
In order to more clearly illustrate the technical solutions of the present application, the drawings that are needed in the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a flowchart illustrating an image frame alignment method according to some embodiments of the present disclosure;
FIG. 2 is a flow chart of calculating rotation parameters and translation parameters provided in some embodiments of the present application;
fig. 3 is a schematic view of a scenario for generating grids and grid points provided in some embodiments of the present application;
fig. 4 is a schematic view of effects before a current frame image is aligned with a previous frame image according to some embodiments of the present application;
fig. 5 is a schematic diagram of an effect of a current frame image aligned with a previous frame image according to some embodiments of the present application.
Detailed Description
Reference will now be made in detail to the embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements unless otherwise indicated. The embodiments described below do not represent all embodiments consistent with the present application; they are merely examples of systems and methods consistent with some aspects of the present application as detailed in the claims.
In order to view conditions inside the vehicle and the driving conditions of the vehicle, an in-vehicle camera may be installed in the vehicle. The vehicle-mounted camera generally records video and audio during driving and can provide a reliable basis for the analysis and handling of traffic accidents. It has certain requirements on imaging definition; in a night scene, because the amount of light entering the camera is insufficient and imaging noise is high, noise reduction processing is required to obtain a clearer image. For example, noise reduction may be achieved by a noise reduction module.
Noise reduction can be divided into 2D noise reduction and 3D noise reduction. If noise reduction uses only the information of the same frame, i.e. noise reduction of a single-frame image, it is called 2D noise reduction, which operates only in the two-dimensional spatial domain. If noise reduction uses the information of the preceding and following frames, i.e. it works across frames in the time domain rather than on a single frame, it is called 3D noise reduction; 3D noise reduction adds temporal processing on top of 2D noise reduction and takes the temporal relationship between frames into account, so it is three-dimensional.
In an extremely dark scene, if only 2D noise reduction is used, it is difficult to obtain an ideal noise reduction effect. Thus, to enhance the noise reduction effect, in some embodiments, a 3D noise reduction approach may be used. However, since there may be motion in the photographed scene, a motion alignment operation between images is required when 3D noise reduction is performed.
In one possible approach, the motion relationship between image frames may be obtained by calculating matching points between the frames, for example with a feature point matching algorithm or an optical flow algorithm. However, in extremely dark scenes, the accuracy of algorithms that work directly on the images is difficult to guarantee. In another possible approach, besides the vehicle-mounted camera, the vehicle-mounted system may also carry sensors such as an inertial measurement unit (IMU) or a lidar, and the motion relationship between image frames can be obtained with the IMU or the lidar. For example, the rotation angle between image frames can be obtained by integrating the angular velocity of the IMU between the frames, and the positional relationship between the camera and the IMU can be calibrated, so that the inter-frame rotation transformation of the image can be obtained through the IMU; a homography transformation (homography matrix, H) between the image frames can then be obtained from the rotation transformation, thereby aligning the image frames.
However, this processing does not take translation into account, and the vehicle inevitably translates while driving, so the calculated alignment result between the image frames is inaccurate. In summary, the alignment result between image frames is inaccurate both when it is calculated by an algorithm based directly on the images and when it is calculated by an IMU alone.
In order to solve the problem that the alignment result between image frames is inaccurate, some embodiments of the present application provide an image frame alignment method. The method can align a current frame image and a previous frame image in a vehicle-mounted scene and can then be used to implement 3D noise reduction. Compared with alignment methods that rely purely on visual information such as images shot by a camera, this method calculates the rotation and translation relationship between the two frames, i.e. the rotation parameters and the translation parameters, through an inertial sensor. Because the inertial sensor does not depend on scene illumination, alignment between image frames can be achieved even in low light, improving the accuracy of the alignment result between image frames.
In order to facilitate understanding of the technical solutions in some embodiments of the present application, the following details of each step are described with reference to some specific embodiments and the accompanying drawings. Fig. 1 is a flowchart of an image frame alignment method provided in some embodiments of the present application, as shown in fig. 1, in some embodiments, the image frame alignment method may include the following steps S1 to S6, which are specifically described as follows:
step S1: and acquiring a matching point pair of the current frame image and the previous frame image, wherein the matching point pair is a point pair corresponding to the same pixel position in the current frame image and the previous frame image.
In the vehicle-mounted scene, since there may be motion in the photographed scene, the content at the same pixel position in the previous frame image and the current frame image may differ. For convenience of description, the pairs of points corresponding to the same pixel position are referred to as matching point pairs. In some embodiments, the matching point pairs of the current frame image and the previous frame image may be obtained from images captured by the vehicle-mounted camera, for example with an algorithm based on feature point matching or by tracking between the two frames with sparse optical flow, as sketched below; this provides a data basis for the subsequent calculations. After step S1 is completed, the following step S2 may be executed.
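Purely as an illustrative sketch (the disclosure does not fix a particular matching algorithm), step S1 could be implemented with sparse optical flow, for example using OpenCV; the function name and parameter values below are assumptions rather than part of the filing:

    import cv2
    import numpy as np

    def get_matching_point_pairs(prev_gray, curr_gray, max_corners=500):
        """Track sparse corners from the previous frame into the current frame.

        Returns two (N, 2) arrays of matched pixel coordinates (previous, current).
        """
        # Detect sparse feature points in the previous frame.
        prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                           qualityLevel=0.01, minDistance=8)
        # Track them into the current frame with pyramidal Lucas-Kanade optical flow.
        curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                       prev_pts, None)
        ok = status.ravel() == 1
        return prev_pts[ok].reshape(-1, 2), curr_pts[ok].reshape(-1, 2)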
Step S2: and calculating rotation parameters and translation parameters of the current frame image and the previous frame image through an inertial sensor, and calculating the point pair depth of the matching point pair according to the matching point pair, the rotation parameters and the translation parameters.
Because there may be motion between the vehicle camera and the subject, the captured current and previous frame images may be related by a rotation and/or a translation. In some embodiments, the IMU, a sensor that detects and measures acceleration and rotational motion, is used to calculate the rotation parameters and translation parameters of the current and previous frame images, and the point pair depth of the matching point pairs is calculated from the matching point pairs, the rotation parameters and the translation parameters. That is, in the embodiments of the present application, the image information is fused with the inertial sensor; since the inertial sensor is independent of illumination and is not affected by scene brightness, operations such as noise reduction and alignment can be performed accurately even in a dark scene.
Fig. 2 is a schematic flow chart of calculating the rotation parameters and translation parameters according to some embodiments of the present application. As shown in fig. 2, when the rotation and translation relationship of the current frame image and the previous frame image, i.e. the rotation parameters and translation parameters, is calculated through the inertial sensor, the calculation may proceed as follows: first, the angular velocity of the inertial sensor is calculated from the device parameters, the device offset variable and the Gaussian noise of the inertial sensor; then the linear acceleration of the inertial sensor is calculated from the rotation variable, the device offset variable and the white noise of the inertial sensor; next, a dynamics model of the inertial sensor is obtained from the angular velocity, and a discrete model of the dynamics model is obtained through Euler integration; finally, the formula for calculating the linear acceleration is substituted into the discrete model to obtain the rotation parameters and the translation parameters.
By way of example, the process of calculating rotation parameters and translation parameters by the IMU will be described below in connection with a specific derivation process. It should be noted that the following formulas are only exemplary, and not limiting the application, and may be implemented in other ways. Taking the rotation parameter as R and the translation parameter as t as an example, the parameter p appears in the following formula, and p is also the translation parameter and may be the same as t. First, the manner in which the IMU angular velocity is calculated is described by equation one:
in the above-mentioned formula one,i.e. the angular velocity of the IMU, the corner marks w and b represent the device parameters, respectively, e.g. the external parameters and the own parameters, respectively, the model being assumed to be static, b g For device offset variable, η g Representing gaussian noise that needs to be processed during the process.
The IMU linear acceleration is calculated by referring to the following equation:
where the rotation term represents the rotation variable of the IMU itself, g_w represents the gravitational acceleration, b_a is likewise a device offset variable, and η_a is white noise.
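The formula images themselves are not reproduced in this text. Purely as a reference reconstruction from the symbol descriptions above (the exact notation of formulas one and two in the original filing may differ), the gyroscope and accelerometer measurement models are commonly written as:

    \tilde{\omega}^{b}(t) = \omega^{b}_{wb}(t) + b_{g}(t) + \eta_{g}(t)
    \tilde{a}^{b}(t) = R_{wb}^{\top}(t)\,\bigl(a^{w}(t) - g^{w}\bigr) + b_{a}(t) + \eta_{a}(t)

where the tilde marks a measured quantity, w and b denote the world frame and the body (device) frame, and R_{wb} is the rotation of the body frame relative to the world frame.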
The dynamics model of the IMU can be expressed as the following formula three:
Through Euler integration, a discrete form of the dynamics model can be obtained; the discrete model is given by the following formula four:
v^w(t+Δt) = v^w(t) + a^w(t)·Δt;
where R may represent a rotation parameter, V may represent a velocity, and P is a translation parameter, which may be the same as the translation parameter t. The quantities and calculation methods related to the velocity and the position in the formula follow techniques commonly used in the field and are not described here. Formula four establishes a pre-integration scheme for the IMU; to make the derivation of the rotation parameter and the translation parameter clearer, in some embodiments the following formula five may be used to convert the expression:
then, the formula two can be combined with the formula four to obtain the following formula six:
in the above formula, η may be used gd And eta ad To replace eta g And eta a . As can be seen by combining the first and sixth formulas, a derivation of the rotation parameter R, the translation parameter t, or p can be obtained.
After the rotation parameter and the translation parameter are obtained, the point pair depth of the matching point pair can be calculated according to the matching point pair, the rotation parameter and the translation parameter. The following procedure may be performed before calculating the point pair depth of the matching point pair. Firstly, pixel change information of a matching point pair in a current frame image and a previous frame image can be compared to determine sparse points and motion information of pixels in the current frame image, then grid points in the current frame image can be determined according to the sparse points and the motion information, and then grids of the current frame image are constructed according to the grid points.
For example, fig. 3 is a schematic view of a scene in which a grid and grid points are generated according to some embodiments of the present application. As shown in fig. 3, by comparing the pixel change information of the matching point pairs in the current frame image and the previous frame image, motion information such as the motion direction of pixel points in the current frame image can be obtained, together with a set of sparse points. The pixel position of a grid point can be calculated as the average of the pixels of the points around the sparse points, and after the grid points are determined they can be connected to form the grid, as sketched below.
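A minimal sketch of this grid construction, assuming a uniform lattice of grid points and a simple radius-based average (both the lattice spacing and the radius are assumptions; the filing does not fix these details):

    import numpy as np

    def build_grid_vertices(sparse_pts, sparse_motion, img_w, img_h,
                            grid_step=32, radius=48.0):
        """Place grid vertices on a uniform lattice and attach to each vertex the
        average motion of the sparse points that fall within `radius` of it."""
        xs = np.arange(0, img_w + 1, grid_step)
        ys = np.arange(0, img_h + 1, grid_step)
        vertices, motions = [], []
        for y in ys:
            for x in xs:
                d = np.linalg.norm(sparse_pts - np.array([x, y], float), axis=1)
                near = d < radius
                # Average motion of the surrounding sparse points; zero if none are nearby.
                m = sparse_motion[near].mean(axis=0) if near.any() else np.zeros(2)
                vertices.append((x, y))
                motions.append(m)
        return np.array(vertices, float), np.array(motions, float)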
Since the above-described calculation of the rotation parameter and the translation parameter uses the parameter such as the velocity V, the above-described manner of calculating the rotation parameter and the translation parameter may be referred to as a relative manner. In some embodiments, the rotation and translation parameters may also be calculated in an absolute manner. In the specific implementation, the pre-integration of the inertial sensor can be calculated through white noise, gaussian noise, rotation parameters and translation parameters, then the absolute coordinate value of the matching point can be calculated based on the pre-integration, and the absolute rotation parameters and the absolute translation parameters can be calculated according to the absolute coordinate value.
As mentioned above, η_gd and η_ad may be used in place of η_g and η_a; the covariance between the discrete noise and the continuous noise then satisfies the following relationship:
assuming Δt is fixed, equation six can yield equation seven in the form of:
where K is the camera intrinsic parameter of the vehicle-mounted camera. Formula seven is the pre-integration of the inertial sensor calculated from the white noise, the Gaussian noise, the rotation parameters and the translation parameters; the absolute rotation parameters and absolute translation parameters can then be calculated by taking the pre-integration as a constraint and combining it with a GPS (Global Positioning System) locator. For example, based on formula seven, formula eight may be derived as follows:
and after the absolute coordinate value calculation is completed, the absolute rotation parameter and the absolute translation parameter can be obtained by subtracting the absolute coordinate values. The specific calculation mode can be combined with the actual use scenario to select a corresponding formula, which is not specifically limited in the application.
After the rotation parameters and translation parameters are calculated, the point pair depth of the matching point pairs can be calculated from the matching point pairs, the rotation parameters and the translation parameters. It should be noted that, to improve the accuracy of the depth calculation, the absolute rotation parameters and absolute translation parameters may be selected; the choice can be made according to the actual scene and the actual requirements.
The point pair depth of a matching point pair can be calculated as follows. First, the rotation residual, the velocity residual and the translation residual of the pose of the GPS locator relative to the inertial sensor are calculated; then a cost function is generated from the rotation residual, the velocity residual and the translation residual; next, the minimum value of the cost function, and the target rotation parameter and target translation parameter corresponding to that minimum, are calculated by gradient descent; finally, the point pair depth of the matching point pair is calculated from the target rotation parameter and the target translation parameter.
By way of example, the rotation residual, the speed residual, and the translation residual may be expressed as the following formula nine:
where the three terms are the rotation residual, the velocity residual and the translation residual respectively. Based on formula nine, the following cost function can be generated:
where x stacks the rotation residual, the velocity residual and the translation residual terms. The minimum of the cost function is calculated by gradient descent: for example, an initial x is chosen, and at each iteration an update Δx is solved so that ||f(x + Δx)|| decreases; if Δx is small enough the iteration stops, otherwise the calculation continues until the minimum of the cost function is reached. The target rotation parameter and target translation parameter corresponding to the minimum are then determined, and the point pair depth of the matching point pair is calculated.
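A minimal numerical sketch of such a minimization, using a plain gradient-descent loop with a finite-difference gradient; the cost function f stands in for the residual-based cost above, and the step size and tolerances are assumptions:

    import numpy as np

    def gradient_descent(f, x0, lr=1e-3, tol=1e-8, max_iter=1000, eps=1e-6):
        """Minimize f(x) by plain gradient descent with a finite-difference gradient.

        x0 stacks the unknowns (e.g. rotation/translation parameters) into one vector.
        Stops when the update step becomes small enough (the "Δx small enough" test above).
        """
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            grad = np.zeros_like(x)
            for i in range(x.size):              # central finite-difference gradient
                dx = np.zeros_like(x)
                dx[i] = eps
                grad[i] = (f(x + dx) - f(x - dx)) / (2 * eps)
            step = lr * grad
            x = x - step
            if np.linalg.norm(step) < tol:       # |Δx| small enough: stop
                break
        return x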
For example, let s1 and s2 denote the point pair depths of a matching point pair, and assume the 3D point P is P = [X, Y, Z]^T; then s1 and s2 can be calculated by the following formula ten:
s1 · p1 = K · P;
s2 · p2 = K · (R · P + t);
where the lower-case p denotes the pixel point and s is the depth used in the 3D noise reduction scene. The point pair depth of the matching point pair, i.e. the depth of each sparse point in the current frame image, is thus obtained. After step S2 is completed, the following step S3 may be executed.
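A small sketch of how formula ten can be solved for the point pair depth; the linear least-squares formulation below is one common choice and is an assumption here, not the filing's prescribed solver:

    import numpy as np

    def point_pair_depth(p1, p2, K, R, t):
        """Recover the depths s1, s2 of a matched point pair from formula ten:
        s1 * p1 = K * P  and  s2 * p2 = K * (R * P + t).

        p1, p2 are pixel coordinates (x, y); returns (s1, s2) by linear least squares.
        """
        t = np.asarray(t, dtype=float)
        K_inv = np.linalg.inv(K)
        x1 = K_inv @ np.array([p1[0], p1[1], 1.0])   # normalized ray in frame t
        x2 = K_inv @ np.array([p2[0], p2[1], 1.0])   # normalized ray in frame t+1
        # From formula ten: s1 * (R @ x1) - s2 * x2 = -t, solved in the least-squares sense.
        A = np.stack([R @ x1, -x2], axis=1)
        s, *_ = np.linalg.lstsq(A, -t, rcond=None)
        return float(s[0]), float(s[1])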
Step S3: vertex depths for vertices in the mesh are calculated from the point-to-depth.
In some embodiments, the grid is obtained by calculating the motion trajectories of pixels in the current frame image and the previous frame image. After the depth of each sparse point is calculated in step S2, when the vertex depth of the vertices in the mesh is calculated, this can be achieved as follows.
First, the peripheral point pairs within a preset range of each matching point pair can be identified; the preset range can be chosen according to the actual usage scenario. The vertices of the grid are then determined from these peripheral point pairs; it should be understood that a sparse point is not necessarily located at a grid vertex. The number of peripheral point pairs and their depths are obtained, and the average depth of a vertex is calculated from the number of point pairs and the peripheral point pair depths; this average depth is taken as the vertex depth of the vertex, as sketched below. After step S3 is completed, the following step S4 may be executed.
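A minimal sketch of this averaging, assuming a Euclidean distance test for the preset range (the range value and the handling of vertices with no nearby point pairs are assumptions):

    import numpy as np

    def vertex_depth(vertex_xy, pair_pts, pair_depths, preset_range=40.0):
        """Average the depths of the matching point pairs lying within the preset
        range of a grid vertex; that average is used as the vertex depth."""
        d = np.linalg.norm(pair_pts - np.asarray(vertex_xy, float), axis=1)
        near = d < preset_range
        if not near.any():
            return None   # no peripheral point pairs: depth left undecided for this vertex
        return float(pair_depths[near].mean())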
Step S4: grid offset values of the grid on the 2D image are calculated from the vertex depth, rotation parameters and translation parameters.
For the final representation of the offset of each pixel between the current frame image and the previous frame image, the offset value of the uniform grid position in the 2D image can be obtained first, and then bilinear interpolation up-sampling is carried out on the grid offset to obtain the offset of each pixel between the two frames.
When calculating the grid offset values of the grid on the 2D image, the pixel coordinates of the current frame image and the previous frame image are obtained first; these pixel coordinates are known. The grid coordinates in the current frame image and the previous frame image are then calculated from the vertex depths, the known pixel coordinates, the rotation parameters and the translation parameters, and once the grid coordinates have been calculated, the grid offset values of the grid on the 2D image are calculated from the grid coordinates.
For example, suppose the vertex depths s1 and s2 of the two frame images are known, where depth s1 corresponds to frame t and depth s2 corresponds to frame t+1, and the rotation parameter R and translation parameter t between the two frames are also known. If frame t is to be aligned to frame t+1, the quantity to be solved for is the 2D coordinate, on frame t+1, of each uniform grid point of frame t, where p1 is the known pixel coordinate on frame t. Combining formula ten, s1, p1 and K can be substituted into the first line of formula ten to obtain P, which is then substituted into the second line to obtain p2; ||p2 - p1|| is then the grid offset value of the grid on the 2D image. After step S4 is completed, the following step S5 may be executed.
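A short sketch of this back-project / re-project step for one grid vertex; returning the 2D offset vector rather than only its norm is an assumption made so that the later per-pixel warping has both components:

    import numpy as np

    def grid_offset(p1, s1, K, R, t):
        """Offset of one grid vertex between frame t and frame t+1 (formula ten).

        Back-project p1 with depth s1 into 3D, transform with (R, t), re-project
        with K, and return the 2D pixel offset p2 - p1."""
        t = np.asarray(t, dtype=float)
        P = s1 * (np.linalg.inv(K) @ np.array([p1[0], p1[1], 1.0]))  # 3D point P from the first line
        q = K @ (R @ P + t)                                          # second line of formula ten
        p2 = q[:2] / q[2]                                            # perspective division (s2 = q[2])
        return p2 - np.asarray(p1, float)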
Step S5: and performing bilinear interpolation up-sampling on the offset value to obtain a target offset value of the current frame image and the previous frame image based on pixels.
Since the image alignment operation is to align each pixel, the offset of the grid needs to be up-sampled to obtain the offset of each pixel point. After calculating the offset value of the grid position in step S4, the grid offset value may be acquired, and then bilinear interpolation up-sampling may be performed on the offset value, to obtain a target offset value based on pixels.
In some embodiments, other interpolation methods may be used, and the selection of a specific method may be combined with the actual usage scenario and the actual requirement, which is not specifically limited in this application. After the completion of the execution of step S5, the following step S6 may be executed.
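As a sketch, the bilinear up-sampling of the grid offsets to per-pixel offsets can be done with a linear resize; using cv2.resize for this is an assumption, and any bilinear interpolation routine would serve:

    import cv2
    import numpy as np

    def upsample_grid_offsets(grid_offsets, img_w, img_h):
        """Bilinearly up-sample per-vertex offsets (H_g x W_g x 2) to one offset
        per pixel (img_h x img_w x 2) of the image."""
        return cv2.resize(grid_offsets.astype(np.float32), (img_w, img_h),
                          interpolation=cv2.INTER_LINEAR)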
Step S6: and aligning the current frame image and the previous frame image according to the target offset value.
After the target offset value is determined, the current frame image and the previous frame image can be aligned according to the target offset value. For example, an alignment module may be provided, by which two frames of images may be aligned based on the target offset value.
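Purely as an illustrative sketch of this alignment step, the previous frame can be warped onto the current frame with a per-pixel remapping; whether the offsets are added or subtracted depends on which frame is taken as the reference, so the sign below is an assumption:

    import cv2
    import numpy as np

    def align_previous_frame(prev_img, pixel_offsets):
        """Warp the previous frame using the per-pixel target offsets.

        pixel_offsets[y, x] is the 2D offset of pixel (x, y) between the two frames."""
        h, w = prev_img.shape[:2]
        xs, ys = np.meshgrid(np.arange(w, dtype=np.float32),
                             np.arange(h, dtype=np.float32))
        map_x = xs + pixel_offsets[..., 0].astype(np.float32)
        map_y = ys + pixel_offsets[..., 1].astype(np.float32)
        return cv2.remap(prev_img, map_x, map_y, interpolation=cv2.INTER_LINEAR)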
Fig. 4 is a schematic view of the effect before the current frame image is aligned with the previous frame image according to some embodiments of the present application, and fig. 5 is a schematic view of the effect after the current frame image is aligned with the previous frame image. Referring to fig. 4 and fig. 5, there is a significant offset between the two frame images before alignment, while the two frame images after alignment show no offset. After alignment, all pixels in the two frames are aligned, i.e. the contents of the two frames correspond, and the noise reduction processing is then performed on the two frames with matching content.
The embodiment of the application also provides an alignment device of the image frames, which comprises an acquisition module, a point-to-depth calculation module, a vertex depth calculation module, an offset value module, a linear interpolation module and an alignment module, wherein:
the acquisition module is used for: acquiring a matching point pair of a current frame image and a previous frame image, wherein the matching point pair is a point pair corresponding to the same pixel position in the current frame image and the previous frame image;
the point-to-depth calculation module is used for: calculating rotation parameters and translation parameters of the current frame image and the previous frame image through an inertial sensor, and calculating the point pair depth of the matching point pair according to the matching point pair, the rotation parameters and the translation parameters;
the vertex depth calculation module is used for: calculating the vertex depth of the vertices in the grid according to the point pair depth, wherein the grid is obtained by calculating the motion trajectories of pixels in the current frame image and the previous frame image;
the offset value module is used for: calculating a grid offset value of the grid on the 2D image according to the vertex depth, the rotation parameter and the translation parameter;
the linear interpolation module is used for: performing bilinear interpolation up-sampling on the offset value to obtain a target offset value of the current frame image and the previous frame image based on pixels;
the alignment module is used for: and aligning the current frame image and the previous frame image according to the target offset value.
Some embodiments of the present application further provide an electronic device, including:
a processor, and a memory communicatively coupled to the processor;
wherein the memory stores instructions executable by the processor to cause the processor to perform a method of aligning image frames in a method class embodiment.
As can be seen from the above technical solutions, the above embodiments provide an image frame alignment method, an image frame alignment device and an electronic device, where the method includes: acquiring matching point pairs of a current frame image and a previous frame image, wherein a matching point pair is a pair of points corresponding to the same pixel position in the two frames; calculating rotation parameters and translation parameters between the two frames through an inertial sensor, and calculating the point pair depth of the matching point pairs according to the matching point pairs, the rotation parameters and the translation parameters; calculating the vertex depths of the vertices in a grid according to the point pair depths, wherein the grid is obtained by calculating the motion trajectories of pixels in the two frames; calculating the grid offset values of the grid on the 2D image according to the vertex depths, the rotation parameters and the translation parameters; performing bilinear interpolation up-sampling on the offset values to obtain pixel-based target offset values between the two frames; and aligning the current frame image and the previous frame image according to the target offset values. Because the rotation and translation relationship between the two frames is calculated through the inertial sensor, which does not depend on scene illumination, alignment between image frames can be achieved even in low light, and the accuracy of the alignment result between image frames is improved.
The foregoing detailed description of the embodiments is merely illustrative of the general principles of the present application and should not be taken in any way as limiting the scope of the invention. Any other embodiments developed in accordance with the present application without inventive effort are within the scope of the present application for those skilled in the art.

Claims (10)

1. A method of aligning image frames, comprising:
acquiring a matching point pair of a current frame image and a previous frame image, wherein the matching point pair is a point pair corresponding to the same pixel position in the current frame image and the previous frame image;
calculating rotation parameters and translation parameters of the current frame image and the previous frame image through an inertial sensor, and calculating the point pair depth of the matching point pair according to the matching point pair, the rotation parameters and the translation parameters;
calculating the vertex depth of the vertices in the grid according to the point pair depth, wherein the grid is obtained by calculating the motion trajectories of pixels in the current frame image and the previous frame image;
calculating a grid offset value of the grid on a 2D image according to the vertex depth, the rotation parameter and the translation parameter;
performing interpolation calculation on the offset value to obtain a target offset value of the current frame image and the previous frame image based on pixels;
and aligning the current frame image and the previous frame image according to the target offset value.
2. The method of aligning image frames according to claim 1, wherein the step of calculating rotation parameters and translation parameters of the current frame image and the previous frame image by an inertial sensor comprises:
calculating the angular velocity of the inertial sensor according to the equipment parameters, the equipment offset variables and the Gaussian noise of the inertial sensor;
calculating linear acceleration of the inertial sensor according to the rotation variable, the equipment offset variable and the white noise of the inertial sensor;
acquiring a dynamics model of the inertial sensor through the angular velocity, and acquiring a discrete model of the dynamics model through Euler integration;
and inputting a formula for calculating the linear acceleration into the discrete model to obtain the rotation parameter and the translation parameter.
3. The method of aligning image frames of claim 2, further comprising:
calculating a pre-integral of the inertial sensor from the white noise, gaussian noise, the rotation parameter and the translation parameter;
calculating absolute coordinate values of the matching points based on the pre-integral and a GPS (global positioning system) locator;
and calculating an absolute rotation parameter and an absolute translation parameter according to the absolute coordinate values.
4. The method of aligning image frames of claim 3 further comprising:
comparing the pixel change information of the matching point pairs in the current frame image and the previous frame image to determine sparse points and motion information of pixels in the current frame image;
determining grid points in the current frame image according to the sparse points and the motion information;
and constructing grids of the current frame image according to the grid points.
5. A method of aligning image frames in accordance with claim 3 wherein the step of calculating a point pair depth of said matched point pair based on said matched point pair, said rotation parameter and said translation parameter comprises:
calculating a rotation residual error, a speed residual error and a translation residual error of the pose of the GPS positioner relative to the inertial sensor;
generating a cost function according to the rotation residual error, the speed residual error and the translation residual error;
calculating a minimum value of the cost function, and a target rotation parameter and a target translation parameter corresponding to the minimum value, by gradient descent;
and calculating the point pair depth of the matched point pair according to the target rotation parameter and the target translation parameter.
6. The method of alignment of image frames according to claim 1, wherein the step of computing vertex depths of vertices in the grid from said point pair depth comprises:
identifying peripheral point pairs within a preset range of the matching point pairs;
determining vertexes in the grid according to the peripheral point pairs;
acquiring the number of the peripheral point pairs and the depth of the peripheral point pairs;
calculating the average depth of the vertexes according to the point pair quantity and the peripheral point pair depth;
the average depth is marked as the vertex depth of the vertex.
7. The method of alignment of image frames according to claim 1, wherein the step of calculating a grid offset value of said grid on a 2D image from said vertex depth, said rotation parameter and said translation parameter comprises:
acquiring pixel coordinates of the current frame image and the previous frame image;
calculating grid coordinates in the current frame image and the previous frame image according to the vertex depth, the pixel coordinates, the rotation parameters and the translation parameters;
and calculating grid offset values of the grids on the 2D image according to the grid coordinates.
8. The method according to claim 7, wherein the step of performing interpolation calculation on the offset value to obtain a pixel-based target offset value for the current frame image and the previous frame image, comprises:
acquiring the grid offset value;
and performing bilinear interpolation up-sampling on the offset value to obtain a target offset value based on the pixel.
9. The image frame alignment device is characterized by comprising an acquisition module, a point-to-depth calculation module, a vertex depth calculation module, an offset value module, a linear interpolation module and an alignment module;
the acquisition module is used for: acquiring a matching point pair of a current frame image and a previous frame image, wherein the matching point pair is a point pair corresponding to the same pixel position in the current frame image and the previous frame image;
the point-to-depth calculation module is used for: calculating rotation parameters and translation parameters of the current frame image and the previous frame image through an inertial sensor, and calculating the point pair depth of the matching point pair according to the matching point pair, the rotation parameters and the translation parameters;
the vertex depth calculation module is used for: calculating the vertex depth of the vertices in the grid according to the point pair depth, wherein the grid is obtained by calculating the motion trajectories of pixels in the current frame image and the previous frame image;
the offset value module is used for: calculating a grid offset value of the grid on a 2D image according to the vertex depth, the rotation parameter and the translation parameter;
the linear interpolation module is used for: performing interpolation calculation on the offset value to obtain a target offset value of the current frame image and the previous frame image based on pixels;
the alignment module is used for: and aligning the current frame image and the previous frame image according to the target offset value.
10. An electronic device, comprising:
a processor;
a memory communicatively coupled to the processor;
wherein the memory stores instructions executable by the processor to cause the processor to perform the method of aligning image frames of any of claims 1-8.
CN202311575413.8A 2023-11-23 2023-11-23 Image frame alignment method and device and electronic equipment Pending CN117808839A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311575413.8A CN117808839A (en) 2023-11-23 2023-11-23 Image frame alignment method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311575413.8A CN117808839A (en) 2023-11-23 2023-11-23 Image frame alignment method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN117808839A true CN117808839A (en) 2024-04-02

Family

ID=90420523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311575413.8A Pending CN117808839A (en) 2023-11-23 2023-11-23 Image frame alignment method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN117808839A (en)


Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: Room 59, 17th Floor, Science and Technology Innovation Building, No. 777 Zhongguan West Road, Zhuangshi Street, Ningbo City, Zhejiang Province, 315201

Applicant after: Aixin Yuanzhi Semiconductor Co.,Ltd.

Address before: Room 59, 17th Floor, Science and Technology Innovation Building, No. 777 Zhongguan West Road, Zhuangshi Street, Zhenhai District, Ningbo City, Zhejiang Province, 315201

Applicant before: Aixin Yuanzhi Semiconductor (Ningbo) Co.,Ltd.

Country or region before: China

CB02 Change of applicant information