
CN116309755A - Image registration method, surface normal vector reconstruction method, system and electronic equipment - Google Patents


Info

Publication number
CN116309755A
Authority
CN
China
Prior art keywords
optical flow
frame
pixel
flow estimation
target frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310325418.9A
Other languages
Chinese (zh)
Inventor
孔嘉明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Databaker Beijing Technology Co., Ltd.
Original Assignee
Databaker Beijing Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Databaker Beijing Technology Co., Ltd.
Priority to CN202310325418.9A
Publication of CN116309755A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G06T7/337Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/30Polynomial surface description
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/269Analysis of motion using gradient-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Algebra (AREA)
  • Multimedia (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present application provide an image registration method, a surface normal vector reconstruction method, a corresponding system, and an electronic device. The image registration method comprises: acquiring a plurality of video frames; determining, based on an optical flow estimation algorithm, a first optical flow estimation result from a target frame among the plurality of video frames to a reference frame; performing edge detection on the target frame and the reference frame respectively, to obtain an edge image of the target frame and an edge image of the reference frame; determining a second optical flow estimation result from the edge image of the target frame to the edge image of the reference frame using a second loss function and based on the optical flow estimation algorithm, the second loss function including a loss term with respect to the rate of change of high-frequency information in the edge images; and adjusting the target frame based on the first optical flow estimation result and the second optical flow estimation result to obtain an adjusted video frame. The scheme achieves high registration accuracy with a small amount of calculation and a high processing speed, and can effectively realize real-time and accurate registration of a plurality of video frames.

Description

Image registration method, surface normal vector reconstruction method, system and electronic equipment
Technical Field
The present application relates to the field of image processing technology, and in particular, to an image registration method, a surface normal vector reconstruction method, an image registration system, a surface normal vector reconstruction system, an electronic device, and a storage medium.
Background
In many entertainment or industrial imaging applications, different images may need to be registered using an optical flow estimation algorithm. For example, when the target object may undergo motion changes within a short time (for instance, a human body produces micro-motions such as breathing), additional calculation needs to be performed on the video frames of the target object to achieve registration. Specifically, an optical flow estimation algorithm may be employed to estimate the optical flow between frames, thereby enabling coarse registration between video frames.
However, such an algorithm assumes that the optical flow of all regions in the picture is globally smooth and contains no abrupt points, and therefore applies a smoothing interpolation process to the optical flow of the entire picture. In practice, the global optical flow between frames is often not smooth. For example, when the target object is an irregular solid object or a human body, its motion easily causes a region that is invisible in an earlier one of the captured video frames to become visible in a later frame. As a result, the optical flow of such a region cannot be calculated accurately, and the image registration accuracy is poor.
To address the registration accuracy problem, optical flow estimation algorithms based on neural networks have appeared in recent years, but their computational complexity is very high. For the same amount of work, the hardware resources consumed may be hundreds of times those of a traditional optical flow estimation algorithm, so such methods are not suitable for scenarios with high real-time requirements and large workloads.
Disclosure of Invention
In order to at least partially solve the above-mentioned problems occurring in the prior art, an image registration method, a surface normal vector reconstruction method, an image registration system, a surface normal vector reconstruction system, an electronic device and a storage medium are provided.
According to one aspect of the present application, there is provided an image registration method, including:
acquiring a plurality of video frames;
determining a first optical flow estimation result from a target frame to a reference frame in a plurality of video frames based on an optical flow estimation algorithm;
respectively carrying out edge detection on the target frame and the reference frame to respectively obtain an edge image of the target frame and an edge image of the reference frame;
determining a second optical flow estimation result from the edge image of the target frame to the edge image of the reference frame using a second loss function and based on an optical flow estimation algorithm, wherein the second loss function includes a loss term with respect to the rate of change of high-frequency information in the edge images; and
adjusting the target frame based on the first optical flow estimation result and the second optical flow estimation result to obtain an adjusted video frame.
Illustratively, the second loss function E_2(u) is expressed using the following formula:

E_2(u) = \sum_{x,y}\Big[P_1\big((x,y)+u_2[x,y]\big)-P_2(x,y)\Big]^2 + \beta\sum_{x,y}\Big[\nabla P_1\big((x,y)+u_2[x,y]\big)-\nabla P_2(x,y)\Big]^2 + \gamma\sum_{x,y}\big\|\nabla u_2[x,y]\big\|^2

wherein P_1(x, y) represents the pixel value of pixel (x, y) in the edge image of the target frame, P_2(x, y) represents the pixel value of pixel (x, y) in the edge image of the reference frame, u_2[x, y] represents the optical flow displacement of pixel (x, y) in the second optical flow estimation result, and β and γ represent adjustment coefficients.
Illustratively, the method further comprises:
determining an occlusion region and a non-occlusion region of the target frame relative to the reference frame based at least on a difference between the first optical flow estimation result and the second optical flow estimation result;
adjusting the target frame based on the first optical flow estimation result and the second optical flow estimation result to obtain an adjusted video frame, comprising:
for each first pixel in the non-occlusion region of the target frame, directly adjusting the first pixel according to a first optical flow estimation result of the first pixel;
for each second pixel in the occlusion region of the target frame, adjusting the second pixel according to the first optical flow estimation results of other pixels in the target frame.
Illustratively, adjusting the second pixel according to the first optical flow estimation results of other pixels in the target frame includes:
for each second pixel in the occlusion region of the target frame, adjusting the second pixel according to the first optical flow estimation result of at least one first pixel in the non-occlusion region of the target frame that is closest to the second pixel.
Illustratively, adjusting the second pixel according to the first optical flow estimation result of at least one first pixel in the non-occlusion region of the target frame that is closest to the second pixel includes the following steps, illustrated by the sketch after this list:
determining a plurality of first pixels in the non-occlusion region of the target frame that are closest to the second pixel;
performing an inverse-distance-weighted average of the first optical flow estimation results of the determined first pixels according to the distance between each of these first pixels and the second pixel, to determine the optical flow displacement of the second pixel; and
adjusting the second pixel according to the optical flow displacement of the second pixel.
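By way of example and not limitation, the following sketch (written here in Python with NumPy, which the present application does not mandate; the function name, the choice of k nearest pixels, and the small constant avoiding division by zero are illustrative assumptions) shows how the optical flow displacement of an occluded second pixel could be filled in by inverse-distance-weighted averaging of the first optical flow estimation results of its nearest non-occluded first pixels.

```python
import numpy as np

def fill_occluded_flow(flow1, occlusion_mask, k=4):
    """Hypothetical sketch: assign each occluded pixel an optical flow
    displacement obtained by inverse-distance-weighted averaging of the
    first optical flow estimates of its k nearest non-occluded pixels.

    flow1:          (H, W, 2) first optical flow estimation result
    occlusion_mask: (H, W) bool array, True where the pixel is occluded
    """
    filled = flow1.copy()
    ys_free, xs_free = np.nonzero(~occlusion_mask)   # non-occluded ("first") pixels
    ys_occ, xs_occ = np.nonzero(occlusion_mask)      # occluded ("second") pixels
    free_coords = np.stack([ys_free, xs_free], axis=1).astype(np.float64)

    for y, x in zip(ys_occ, xs_occ):
        # distances from this occluded pixel to every non-occluded pixel
        d = np.linalg.norm(free_coords - np.array([y, x], dtype=np.float64), axis=1)
        nearest = np.argsort(d)[:k]                  # k nearest non-occluded pixels
        w = 1.0 / (d[nearest] + 1e-6)                # inverse-distance weights
        w /= w.sum()
        # weighted average of their first optical flow estimates
        filled[y, x] = (w[:, None] * flow1[ys_free[nearest], xs_free[nearest]]).sum(axis=0)
    return filled
```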
Illustratively, determining an occlusion region and a non-occlusion region of the target frame relative to the reference frame based at least on a difference between the first optical flow estimation result and the second optical flow estimation result comprises:
the area where the pixel (x, y) satisfying the following formula is located is determined as an occlusion area of the target frame with respect to the reference frame:
|u_1[x, y] - u_2[x, y]| + \eta\,\big(\|\nabla u_1[x, y]\| + \|\nabla u_2[x, y]\|\big) > t

wherein u_1[x, y] represents the optical flow displacement of pixel (x, y) in the first optical flow estimation result, u_2[x, y] represents the optical flow displacement of pixel (x, y) in the second optical flow estimation result, η represents an adjustment coefficient with η > 0, and t represents an optical flow threshold; and
the area outside the occlusion area is determined as a non-occlusion area.
Illustratively, the method further comprises:
determining each frame of at least part of the plurality of video frames as a pending frame;
for each pending frame, performing the following operations for each target frame of the pending frame,
determining a first optical flow estimation result from the target frame to the pending frame based on an optical flow estimation algorithm;
performing edge detection on the target frame and the pending frame respectively, to obtain an edge image of the target frame and an edge image of the pending frame respectively;
determining a second optical flow estimation result from the edge image of the target frame to the edge image of the pending frame using a second loss function and based on an optical flow estimation algorithm;
determining an occlusion region of the target frame relative to the pending frame based at least on a difference between the first optical flow estimation result from the target frame to the pending frame and the second optical flow estimation result from the target frame to the pending frame;
calculating the area of the occlusion region of the target frame relative to the pending frame;
calculating the sum of the areas of the occlusion regions of the target frames of the pending frame relative to the pending frame; and
comparing the area sums calculated for the respective pending frames, and determining the pending frame corresponding to the smallest area sum as the reference frame.
Illustratively, the method further comprises:
an intermediate one of the plurality of video frames is determined as a reference frame.
Illustratively, the optical flow estimation algorithm includes a two-frame differential optical flow estimation algorithm and a dense inverse search optical flow estimation algorithm.
According to another aspect of the present application, there is also provided a surface normal vector reconstruction method, including:
registering a plurality of video frames of the target object by using the image registration method;
and reconstructing a surface normal vector of the target object by using the adjusted video frame.
According to another aspect of the present application, there is also provided an image registration system including:
the acquisition module is used for acquiring a plurality of video frames;
a first determining module, configured to determine a first optical flow estimation result from a target frame to a reference frame in the plurality of video frames based on an optical flow estimation algorithm;
the edge detection module is used for respectively carrying out edge detection on the target frame and the reference frame so as to respectively obtain an edge image of the target frame and an edge image of the reference frame;
a second determination module for determining a second optical flow estimation result from the edge image of the target frame to the edge image of the reference frame using a second loss function and based on an optical flow estimation algorithm, wherein the second loss function includes a loss term with respect to the rate of change of high-frequency information in the edge images; and
an adjustment module for adjusting the target frame based on the first optical flow estimation result and the second optical flow estimation result, so as to obtain an adjusted video frame.
According to another aspect of the present application, there is also provided a surface normal vector reconstruction system, including:
the registration module is used for registering a plurality of video frames of the target object by utilizing the image registration method; and
and the reconstruction module is used for reconstructing the surface normal vector of the target object by using the adjusted video frame.
According to another aspect of the present application, there is also provided an electronic device comprising a processor and a memory, wherein the memory stores computer program instructions for performing the above-mentioned image registration method and/or the above-mentioned surface normal vector reconstruction method when the computer program instructions are executed by the processor.
According to another aspect of the present application, there is also provided a storage medium having stored thereon program instructions for performing the above-described image registration method and/or the above-described surface normal vector reconstruction method when executed.
In the above scheme, a first optical flow estimation result from each target frame among the plurality of video frames to the reference frame is obtained, and a second optical flow estimation result between the respective edge images of the target frame and the reference frame is obtained; accurate registration of each target frame to the reference frame is then achieved based on the two optical flow estimation results. Moreover, since the second loss function employed in determining the second optical flow estimation result includes a loss term with respect to the rate of change of the high-frequency information in the two edge images, the optical flow between each target frame and the reference frame can be estimated accurately for every detailed region. In particular, the accuracy of optical flow estimation for a dynamic object photographed at close range can be improved, as can the accuracy of optical flow estimation for regions where the optical flow changes abruptly between video frames. Furthermore, the computational cost of the scheme is close to that of a traditional optical flow estimation algorithm, so the amount of calculation is small and the processing speed is high, and real-time and accurate registration of a plurality of video frames can be effectively realized.
This summary introduces a selection of concepts in a simplified form that are further described in the detailed description below. It is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Advantages and features of the present application are described in detail below with reference to the accompanying drawings.
Drawings
The following drawings of the present application are included to provide an understanding of the present application and form a part of it. Embodiments of the present application and descriptions thereof are shown in the drawings to explain the principles of the present application. In the drawings:
FIG. 1 shows a schematic flow chart of an image registration method according to one embodiment of the present application;
FIG. 2 illustrates a schematic diagram of the principle of determining a first optical flow estimation result according to one embodiment of the present application;
FIG. 3 illustrates a schematic diagram of the principle of determining a second optical flow estimation result according to one embodiment of the present application;
FIG. 4 shows a schematic flow chart of a surface normal vector reconstruction method according to one embodiment of the present application;
FIG. 5 shows a schematic block diagram of an image registration system according to one embodiment of the present application;
FIG. 6 shows a schematic block diagram of a surface normal vector reconstruction system according to one embodiment of the present application; and
fig. 7 shows a schematic block diagram of an electronic device according to one embodiment of the present application.
Detailed Description
In the following description, numerous details are provided to provide a thorough understanding of the present application. However, it will be understood by those skilled in the art that the following description illustrates preferred embodiments of the present application by way of example only and that the present application may be practiced without one or more of these details. In addition, some technical features that are known in the art have not been described in detail in order to avoid obscuring the present application.
To at least partially solve the above technical problem, according to a first aspect of the present application, an image registration method is provided. The image registration method can accurately calculate the optical flow among a plurality of video frames of the target object, and further realizes the accurate registration of the plurality of video frames according to the optical flow estimation result. The scheme has small calculated amount and high processing efficiency, so that real-time registration between video frames can be effectively realized.
Fig. 1 shows a schematic flow chart of an image registration method 100 according to one embodiment of the present application. As shown in fig. 1, the image registration method 100 includes the following steps.
In step S110, a plurality of video frames are acquired.
According to embodiments of the present application, the plurality of video frames may be video frames of the target object acquired using any suitable image acquisition device, existing or developed in the future. The target object may be any suitable object. For example, it may be a target human or animal, or other three-dimensional object.
According to the embodiment of the application, the plurality of video frames may be acquired by using the same image acquisition device, or may be acquired by using different image acquisition devices. The plurality of video frames may be consecutive video frames, for example, a plurality of images of the target object consecutively acquired by the image acquisition device. The plurality of video frames may also be discontinuous video frames, for example, video frames that meet a preset time domain requirement and are selected from the continuous video frames. In a specific embodiment, the target object is a target human body or a three-dimensional object, and the plurality of video frames may be a group of continuous video frames of the target object shot by using an image acquisition device in a short distance under the light source environment that the light source irradiates the surface of the target object from different preset angles at different times. It will be appreciated that in this case, the positions of the target objects in the image are substantially the same in the plurality of video frames, but the brightness of each corresponding region is different. Multiple video frames acquired by this approach may facilitate registration using optical flow estimation.
According to the embodiment of the application, each of the plurality of video frames may be an image with any size or resolution, or may be an image meeting a preset resolution requirement. The plurality of video frames may be black and white images or may be color images. The requirements for the plurality of video frames may be set based on actual processing requirements, hardware of the image acquisition device, etc. The plurality of video frames can be original images directly acquired by the image acquisition device, or can be images obtained by preprocessing the original images. The preprocessing operations may include operations that facilitate subsequent processing, such as geometric transformation, filtering, etc., of the original image.
Step S130, determining a first optical flow estimation result from a target frame to a reference frame in the plurality of video frames based on the optical flow estimation algorithm.
It should be understood that the reference frame and the target frames among the plurality of video frames are defined relative to each other. The reference frame may be a video frame selected from the plurality of video frames, and the video frames other than the reference frame may be referred to as target frames. For example, if the plurality of video frames are 3 consecutive video frames and the 2nd frame is selected as the reference frame, then the 1st frame and the 3rd frame are target frames.
Any suitable method may be employed to determine the reference frame in accordance with embodiments of the present application. Illustratively, the method 100 may further include determining a reference frame from the plurality of video frames. In one example, any one of a plurality of video frames may be determined to be a reference frame. In another example, a particular frame of the plurality of video frames may also be determined as a reference frame. Alternatively, a video frame satisfying the timing requirement among the plurality of video frames may be determined as the reference frame. For example, the method 100 may include step S121 of determining an intermediate one of the plurality of video frames as a reference frame. For example, if the number of the plurality of video frames is 7, the 4 th frame ordered in the photographing timing may be determined as the reference frame. It can be appreciated that the movement of the target object is generally continuous and stable, so that the method of optical flow estimation calculation and subsequent registration of the target frame by using the middle video frame as a reference frame has smaller calculation amount and relatively more accurate registration result. Alternatively, a video frame satisfying a preset image quality requirement may be used as the reference frame. For example, a target recognition algorithm may be used to recognize the region in which the target object is located, and calculate the proportion of the target object region in the image. Then, a video frame with the highest target object ratio among the plurality of video frames may be used as a reference frame. Alternatively, a plurality of video frames meeting the preset timing requirement can be sequentially used as reference frames, and the optical flow from each corresponding target frame to the reference frame can be calculated one by one. And finally screening out a final reference frame according to the optical flow estimation result. Specific implementations of such examples will be described later and are not described in detail herein. Of course, other suitable methods for determining the reference frame may be used.
According to embodiments of the present application, any suitable optical flow estimation algorithm, existing or developed in the future, may be used to determine the optical flow between each target frame to the determined reference frame. By way of example and not limitation, step S130 may calculate the first optical flow estimation result from the target frame to the reference frame using any one of a two-frame differential optical flow estimation algorithm (Lucas Kanade optical flow algorithm, abbreviated as L-K optical flow algorithm) and a dense inverse search optical flow estimation algorithm (Dense Inverse Search-based method, abbreviated as DIS optical flow algorithm). Optionally, an L-K optical flow algorithm may be used to calculate a sparse optical flow estimate between each target frame and the reference frame, and then an interpolation algorithm may be used to obtain a dense optical flow estimate between each target frame and the reference frame. Alternatively, the DIS optical flow algorithm may also be used to directly obtain dense optical flow estimates from each target frame to the reference frame. For the latter, because the DIS optical flow algorithm is an algorithm for balancing optical flow quality and calculation time to a greater extent, the first optical flow estimation result obtained by adopting the scheme has better quality, smaller calculation amount and higher calculation efficiency, so that the first optical flow estimation result with high precision can be ensured to be obtained in real time, and further the real-time registration of a plurality of video frames can be realized.
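By way of example and not limitation, a dense first optical flow estimation result could be obtained with the DIS optical flow implementation provided by OpenCV, as sketched below; the preset and the grayscale conversion are illustrative choices rather than requirements of the present application.

```python
import cv2

def dense_flow_dis(target_frame, reference_frame):
    """Sketch: dense optical flow from the target frame to the reference frame
    using OpenCV's DIS optical flow (one possible realization of step S130)."""
    tgt = cv2.cvtColor(target_frame, cv2.COLOR_BGR2GRAY)
    ref = cv2.cvtColor(reference_frame, cv2.COLOR_BGR2GRAY)
    dis = cv2.DISOpticalFlow_create(cv2.DISOPTICAL_FLOW_PRESET_MEDIUM)
    # flow[y, x] = (dx, dy): the per-pixel optical flow displacement matrix
    return dis.calc(tgt, ref, None)
```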
The process of obtaining an optical flow estimation result is described below, taking as an example the direct acquisition of dense optical flow estimates between each target frame and the reference frame using the DIS optical flow algorithm. FIG. 2 illustrates a schematic diagram of the principle of determining a first optical flow estimation result according to one embodiment of the present application. As shown in FIG. 2, if the target frame and the reference frame are denoted as I(a) and I(b) respectively, and the optical flow between the target frame and the reference frame is denoted as O(a, b), then the pixels of the target frame and the reference frame satisfy the relationship I(a) + O(a, b) = I(b). That is, using the optical flow between the two video frames, the target frame can be converted into the reference frame.
Assume that the pixel point P'_2(x, y) at pixel position (x, y) in the reference frame is obtained by moving the pixel point P'_1((x, y) - u_1[x, y]) located at pixel position (x, y) - u_1[x, y] in the target frame, where u_1[x, y] can be regarded as the optical flow displacement of that pixel. The optical flow displacement matrix between the target frame and the reference frame can then be solved using the following first loss function:

E_1(u) = \sum_{x,y}\Big[P'_1\big((x,y)-u_1[x,y]\big)-P'_2(x,y)\Big]^2 + \alpha\sum_{x,y}\big\|\nabla u_1[x,y]\big\|^2

It will be appreciated that the first loss function comprises two parts. The first part represents the error with which each pixel of the target frame, after being shifted according to the optical flow displacement matrix u_1[x, y], matches the corresponding pixel in the reference frame. The second part represents the smoothness of the interior of the optical flow: assuming that all pixels move by approximately the same displacement in one direction, the interior of the optical flow can be considered smooth, which conforms to natural motion. In the formula, α represents an adjustment coefficient; it is a fixed value and can be set arbitrarily according to actual processing requirements. This way of setting the loss function in the DIS optical flow algorithm balances optical flow smoothness against the degree of matching after image deformation.
By using the method, the optical flow displacement matrix from each target frame to the reference frame in the plurality of video frames can be calculated in turn, so as to obtain a first optical flow estimation result.
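To make the relation I(a) + O(a, b) = I(b) concrete, the following sketch (an illustrative use of OpenCV's remap, not a step prescribed by the present application; the sampling convention follows the description above) warps a target frame toward the reference frame using an estimated optical flow displacement matrix.

```python
import cv2
import numpy as np

def warp_with_flow(target_frame, flow):
    """Sketch: resample the target frame so that, under the estimated flow,
    it aligns with the reference frame (I(a) + O(a, b) approximates I(b))."""
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    # each reference-frame pixel (x, y) is fetched from (x, y) - u[x, y] in the target frame
    map_x = (grid_x - flow[..., 0]).astype(np.float32)
    map_y = (grid_y - flow[..., 1]).astype(np.float32)
    return cv2.remap(target_frame, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```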
Step S150, edge detection is performed on the target frame and the reference frame, respectively, to obtain an edge image of the target frame and an edge image of the reference frame, respectively.
According to the embodiment of the present application, any existing or future-developed edge detection algorithm can be used to obtain the edge images of the target frame and the reference frame. Illustratively, the edge detection of this step may be implemented using any one of the first-derivative edge detection operators, such as the Roberts Cross operator, the Prewitt operator, the Sobel operator, the Kirsch operator, and the compass operator. Preferably, any one of the second-derivative edge detection operators, such as the Canny operator, the Laplacian operator, and the Marr-Hildreth operator, may be used to implement the edge detection of this step.
In a specific embodiment, a Canny algorithm may be used to perform edge detection on the reference frame and each target frame, so as to obtain an edge image of the reference frame and an edge image of each target frame respectively. Illustratively, the edge images of the reference frame and the target frame may each be a gray-scale image or a black-and-white image with color pixel information removed. The edge image may present characteristic information of the color gradient of the corresponding video frame.
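As one concrete possibility for this step, Canny edge detection as provided by OpenCV could be used; the thresholds and the optional Gaussian blur in the sketch below are illustrative assumptions.

```python
import cv2

def edge_image(frame, low=50, high=150):
    """Sketch: produce an edge image (grayscale, color information removed)
    for a video frame using the Canny operator."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)   # mild smoothing before edge detection
    return cv2.Canny(gray, low, high)
```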
Step S170, a second optical flow estimation result from the edge image of the target frame to the edge image of the reference frame is determined by using the second loss function and based on the optical flow estimation algorithm. Wherein the second loss function comprises a partial equation of the loss of the rate of change of the high frequency information in the edge image.
In this step, the same optical flow estimation algorithm as in step S130 may be used to calculate the optical flow between the edge image of the target frame to the edge image of the reference frame, and obtain a second optical flow estimation result. For example, if the first optical flow estimation result is obtained by using the DIS optical flow algorithm in step S130, the second optical flow estimation result from the edge image of each target frame to the edge image of the reference frame may also be obtained by using the DIS optical flow algorithm in this step. The second optical flow estimation result obtained by adopting the scheme has better quality, smaller calculated amount and higher calculation efficiency, thereby ensuring that the second optical flow estimation result with high precision can be obtained in real time, and further being beneficial to realizing real-time registration of a plurality of video frames.
Unlike the first optical flow estimation result, the second optical flow estimation result is an optical flow between edge images. FIG. 3 illustrates a schematic diagram of the principle of determining a second optical flow estimation result according to one embodiment of the present application. As shown in FIG. 3, similarly to the process of determining the first optical flow estimation result in FIG. 2, the edge image of each target frame and the edge image of the reference frame may be denoted as I'(a) and I'(b) respectively. If the optical flow between the edge image of each target frame and the edge image of the reference frame is denoted as O'(a, b), then the pixels of the edge image of the target frame and of the edge image of the reference frame likewise satisfy the relationship I'(a) + O'(a, b) = I'(b). That is, the edge image of the target frame can be converted into the edge image of the reference frame using the optical flow between the edge images of the two video frames. Re-estimating the optical flow using the edge images of the two video frames can greatly improve the robustness of the optical flow estimation algorithm and the smoothness of the estimated optical flow under illumination changes.
Taking as an example the use of the DIS optical flow algorithm to obtain a dense optical flow estimation result from the edge image of each target frame to the edge image of the reference frame: similarly to the method of determining the first optical flow estimation result, it may be assumed that the pixel point P_2(x, y) at pixel position (x, y) in the edge image of the reference frame is obtained by moving the pixel point P_1((x, y) + u_2[x, y]) located at pixel position (x, y) + u_2[x, y] in the edge image of the target frame, where u_2[x, y] can be regarded as the optical flow displacement of that pixel. The second loss function may be used to solve for the optical flow displacement matrix between the edge image of each target frame and the edge image of the reference frame.
It should be noted that the second loss function used in this step may differ from the first loss function described above. In addition to a loss term indicating the error with which each pixel in the edge image of the target frame matches the corresponding pixel in the edge image of the reference frame, and a loss term indicating the smoothness of the interior of the optical flow, the second loss function may include a term for the loss of the rate of change of the high-frequency information in the two edge images. In other words, when performing optical flow estimation on the respective edge images of the target frame and the reference frame, the matching of the detailed textures of the individual regions in the edge images is also fully taken into account.
For example, in the case where the target object is a person, the plurality of high-definition video frames of the target person each include detailed texture information such as wrinkles and hairs in the hair and clothing of the target person. Alternatively, there may be slight variations in the head of the target person that shake from side to side in multiple video frames, for example, the exposed portions of the ears in the reference frame and the target frame are not identical. In the process of registering a plurality of video frames by using the image registration method, a second optical flow estimation result between the edge image of the target frame and the edge image of the reference frame can be obtained in the step. Due to the addition of the loss fraction of the rate of change of the high frequency information in the edge image in the second loss function of the optical flow estimation, an accurate estimation of the dense optical flow between the edge features of the target frame and the edge features of the reference frame can be achieved. Specifically, for example, it is possible to accurately calculate detailed texture information such as the hair of a target person in a video frame, wrinkles and hair in clothing, and the optical flow of an exposed portion of the five sense organs of the person. Therefore, the defect that the existing optical flow estimation algorithm only estimates the globally smooth optical flow can be overcome, the uneven optical flow between two video frames, such as the optical flow of the exposed part or the shielding part of the five sense organs between the front frame and the rear frame, can be accurately estimated, the accuracy of optical flow estimation of various video frames can be guaranteed to be better, and the calculated amount of optical flow estimation is smaller.
Step S190 adjusts the target frame based on the first optical flow estimation result and the second optical flow estimation result to obtain an adjusted video frame.
As described above, the first optical flow estimation result may be an optical flow estimation result from each target frame to the reference frame obtained by using a conventional optical flow estimation algorithm. The second optical flow estimation result may be an optical flow estimation result between an edge image of each target frame and an edge image of a reference frame, which is obtained by adopting the optical flow estimation algorithm after optimization. It will be appreciated that for the same target frame and reference frame, the first optical flow estimate and the second optical flow estimate therebetween are different. For example, the optical flow displacement matrices obtained by two optical flow estimations are different. The first optical flow estimation result may be a globally smoothed optical flow displacement matrix, and the second optical flow estimation result may be an optical flow displacement matrix that considers inter-frame abrupt regions. In this step, the target frame may be adjusted by integrating the two optical flow estimation results.
It can be appreciated that, for a plurality of video frames of a target object captured within a short time, the optical flow abrupt change region of the target object between two adjacent video frames is relatively small, or occupies only a partial region of the entire picture. Illustratively, in this step, the optical flow abrupt change region between frames may be determined based on the difference between the two optical flow estimation results. Then, the first optical flow estimation result may be used to adjust the non-abrupt regions outside the optical flow abrupt change region in the target frame and align them to the corresponding regions in the reference frame. For pixels located in the optical flow abrupt change region, other suitable methods may be employed to align that region to the corresponding region in the reference frame. By way of example and not limitation, the optical flow abrupt change region in the target frame may be adjusted based on the first optical flow estimation results of pixels of nearby non-abrupt regions. In this way, each target frame among the plurality of video frames can finally be aligned accurately with the reference frame, so that accurate registration among the plurality of video frames can be achieved.
It can be appreciated that if the optical flow between the target frame and the reference frame is estimated by using the conventional optical flow estimation algorithm, in the multiple video frames finally registered based on the optical flow, a relatively obvious difference still exists between the target frame and the reference frame, and in particular, the optical flow abrupt change region in the image cannot achieve the effect of registration at all. For example, the degree to which the mouth of the target person is stretched, the degree to which the eyes are open, the fit of hair filaments, or the texture of the garment may be different among multiple video frames of the target person. A plurality of video frames registered by adopting a traditional optical flow estimation algorithm are overlapped and displayed, so that the dynamic changes of the optical flow abrupt change areas can be obviously observed, and the registration effect is poor. The registered multiple video frames obtained by registering based on the two optical flow estimation results in the embodiment of the present application can accurately align all image areas including the optical flow abrupt change area with the reference frame. And overlapping the registered video frames for display, so that the effect of accurate alignment of the whole and the detail can be presented. For example, in the superimposed video, a jog effect such as opening the mouth of the target person is not presented, except for a change in light.
In addition, the surface normal vector of the target object can be reconstructed by using the image registration method in the embodiment of the application, so as to obtain the external contour shape of the target object. In the prior art, the photometric stereo measurement method is a simpler method for acquiring the normal vector of the surface of the target object, and can be realized without adopting expensive shooting equipment. However, the current photometric stereo measurement method is generally only suitable for obtaining the surface normal vector of an absolute static target object, and for a target object which still has dynamic change in a short time, accurate registration cannot be realized, so that the surface normal vector of the target object cannot be reconstructed. According to the embodiment of the application, the image registration method can be loaded in the image registration process of the photometric stereo measurement method, the surface normal vector of the target object is effectively reconstructed, and the reconstruction efficiency and accuracy are high. Therefore, the application range of the photometric stereo measurement method can be improved.
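Although the present application does not spell out the photometric stereo computation itself, a classical least-squares formulation is sketched below for context: given registered frames lit from known directions, per-pixel surface normals follow from solving L·g = I for g. The stacking of frames, the light-direction matrix, and the variable names are assumptions made here for illustration only.

```python
import numpy as np

def surface_normals(frames, light_dirs):
    """Sketch of classical photometric stereo on registered frames.

    frames:     list of k registered grayscale frames, each (H, W)
    light_dirs: (k, 3) array of known unit light directions, k >= 3
    Returns an (H, W, 3) array of unit surface normals.
    """
    I = np.stack([f.astype(np.float64).ravel() for f in frames], axis=0)  # (k, H*W)
    L = np.asarray(light_dirs, dtype=np.float64)                          # (k, 3)
    # least-squares solve L @ g = I for g = albedo * normal, per pixel
    g, *_ = np.linalg.lstsq(L, I, rcond=None)                             # (3, H*W)
    norm = np.linalg.norm(g, axis=0, keepdims=True)
    n = g / np.maximum(norm, 1e-8)
    h, w = frames[0].shape
    return n.T.reshape(h, w, 3)
```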
In the above scheme, a first optical flow estimation result from each target frame among the plurality of video frames to the reference frame is obtained, and a second optical flow estimation result between the respective edge images of the target frame and the reference frame is obtained; accurate registration of each target frame to the reference frame is then achieved based on the two optical flow estimation results. Moreover, since the second loss function employed in determining the second optical flow estimation result includes a loss term with respect to the rate of change of the high-frequency information in the two edge images, the optical flow between each target frame and the reference frame can be estimated accurately for every detailed region. In particular, the accuracy of optical flow estimation for a dynamic object photographed at close range can be improved, as can the accuracy of optical flow estimation for regions where the optical flow changes abruptly between video frames. Furthermore, the computational cost of the scheme is close to that of a traditional optical flow estimation algorithm, so the amount of calculation is small and the processing speed is high, and real-time and accurate registration of a plurality of video frames can be effectively realized.
It will be appreciated that steps S110 to S190 described above may be performed in a variety of suitable orders of execution. Although steps S110 to S190 shown in fig. 1 are sequentially performed, some of the steps may be performed in other suitable order of execution. For example, step S150 may be performed before step S130, or may be performed simultaneously with step S130. This application is not limited thereto.
Illustratively, the second loss function E_2(u) used in step S170 is expressed by the following formula:

E_2(u) = \sum_{x,y}\Big[P_1\big((x,y)+u_2[x,y]\big)-P_2(x,y)\Big]^2 + \beta\sum_{x,y}\Big[\nabla P_1\big((x,y)+u_2[x,y]\big)-\nabla P_2(x,y)\Big]^2 + \gamma\sum_{x,y}\big\|\nabla u_2[x,y]\big\|^2

wherein P_1(x, y) represents the pixel value of pixel (x, y) in the edge image of the target frame, and P_2(x, y) represents the pixel value of pixel (x, y) in the edge image of the reference frame. u_2[x, y] represents the optical flow displacement of pixel (x, y) in the second optical flow estimation result. β and γ are both adjustment coefficients, usually fixed values, and can be set arbitrarily according to actual processing requirements; their values may or may not be equal.

As previously described, the DIS optical flow algorithm may be used to obtain a dense optical flow estimate between the edge image of each target frame and the edge image of the reference frame. It will be appreciated that, in the above loss function, the first loss term represents the error with which each pixel of the edge image of the target frame, after being shifted according to the optical flow displacement matrix u_2[x, y], matches the corresponding pixel in the edge image of the reference frame. The second loss term is the term for the loss of the rate of change of the high-frequency information in the edge images; it represents the magnitude of the matching error between the rates of change of the high-frequency information in the two edge images. The third loss term indicates the smoothness of the interior of the optical flow: the smaller its value, the smoother the optical flow; the larger its value, the less smooth the optical flow.
The setting method of the second loss function is simple, the calculated amount is small, and the processing speed is high. The optical flow displacement matrix of each pixel between the edge image of the target frame and the edge image of the reference frame can be calculated more quickly and accurately by using the loss function. Furthermore, real-time and accurate registration of a plurality of video frames can be realized, and the user experience is good.
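Purely to make the roles of the three terms tangible, the sketch below evaluates a loss of the described shape for a candidate flow field on two edge images. The gradient operator used for the rate-of-change term, the nearest-neighbour warping, and the function name are assumptions made for illustration; they are not the implementation prescribed by the present application.

```python
import numpy as np

def second_loss(edge_tgt, edge_ref, flow, beta=1.0, gamma=1.0):
    """Sketch: evaluate a loss of the described form
    (pixel matching + matching of gradients of the edge images + flow smoothness)
    for a candidate flow field between two edge images."""
    h, w = edge_ref.shape
    gy, gx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # sample the target edge image at (x, y) + u2[x, y] (nearest neighbour for brevity)
    sx = np.clip(np.round(gx + flow[..., 0]).astype(int), 0, w - 1)
    sy = np.clip(np.round(gy + flow[..., 1]).astype(int), 0, h - 1)
    warped = edge_tgt[sy, sx].astype(np.float64)

    data = np.sum((warped - edge_ref.astype(np.float64)) ** 2)              # term 1: pixel matching
    dwy, dwx = np.gradient(warped)
    dry, drx = np.gradient(edge_ref.astype(np.float64))
    highfreq = np.sum((dwx - drx) ** 2 + (dwy - dry) ** 2)                  # term 2: rate of change of edge detail
    dux = np.gradient(flow[..., 0])
    duy = np.gradient(flow[..., 1])
    smooth = np.sum(dux[0] ** 2 + dux[1] ** 2 + duy[0] ** 2 + duy[1] ** 2)  # term 3: flow smoothness

    return data + beta * highfreq + gamma * smooth
```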
For example, according to an embodiment of the present application, rather than first determining the reference frame among the plurality of video frames, the plurality of video frames may be traversed, each video frame being processed separately as a pending reference frame, and the reference frame may then be determined according to the processing results. Specifically, the method 100 may further include steps S121 to S124.
Step S121, each frame in at least part of the plurality of video frames is determined as a pending frame. The pending frame may be a pending reference frame.
Step S122, for each pending frame, the following steps S122.1 to S122.5 are performed for each target frame of the pending frame.
Step S122.1, determining a first optical flow estimation result from the target frame to the pending frame based on the optical flow estimation algorithm. Step S122.2, performing edge detection on the target frame and the frame to be determined respectively, so as to obtain an edge image of the target frame and an edge image of the frame to be determined respectively. Step S122.3, determining a second optical flow estimation result from the edge image of the target frame to the edge image of the pending frame using a second loss function and based on an optical flow estimation algorithm.
It is understood that steps S122.1 to S122.3 are similar to steps S130, S150 and S170, respectively, and those skilled in the art will understand how to implement them, so the details are not repeated here. For example, the target object is a target face, and the plurality of video frames may be a plurality of images of the target face photographed in succession within a short time, such as 10 consecutively captured video frames. During shooting, there may be micro-motion such as a slight wobble of the head. For example, among the 10 video frames, the above steps S122.1 to S122.3 may be performed with the 5th frame as the pending frame and each of the other 9 video frames as the target frame in turn. First optical flow estimation results between the 9 target frames and the 5th frame may thus be obtained, and second optical flow estimation results between the edge images of the 9 target frames and the edge image of the 5th frame may be obtained accordingly.
Step S122.4, determining an occlusion region of the target frame relative to the pending frame based at least on a difference between the first optical flow estimation result from the target frame to the pending frame and the second optical flow estimation result from the target frame to the pending frame. It will be appreciated that the occlusion region of the target frame relative to the pending frame may be a region of pixels that are present in the target frame but not in the pending frame; it may represent a region of abrupt optical flow change between the target frame and the pending frame. For example, among the 10 video frames of the target face, the 5th frame is a frontal face image and the face in the 1st frame is rotated slightly to the left, so that the exposed area of the right ear of the target face in the 1st frame is larger than in the 5th frame; that is, part of the right-ear region visible in the 1st frame is occluded in the 5th frame. The occlusion region of the 1st frame with respect to the 5th frame may be approximately the region of the 1st frame in which the right ear of the target face is exposed more than in the 5th frame. The occlusion region may be calculated from the difference between the first optical flow estimation result calculated for the two video frames and the second optical flow estimation result calculated for the edge images of the two video frames. Any suitable analysis method may be used to analyze the difference between the two optical flow estimation results to determine the occlusion region.
Illustratively, taking optical flow estimation using the DIS optical flow algorithm as an example, the first optical flow estimation result may be a globally smoothed optical flow displacement matrix, and the second optical flow estimation result may be an optical flow displacement matrix that considers inter-frame abrupt regions. In this step, the occlusion region may be determined directly based on the difference matrix of the two optical flow displacement matrices. For example, a first threshold may be set, and an area formed by the corresponding pixels when the value of the determinant of the difference matrix of the two images is larger than the first threshold may be determined as an occlusion area. Alternatively, an index value that can represent optical flow smoothness of each pixel position in the image may be calculated from two optical flow displacement matrices, and an area formed by pixels with poor optical flow smoothness may be determined as an occlusion area of the target frame with respect to the frame to be determined according to the magnitude of the index value.
In a specific example, step S122.4 may include step S122.41.
Step S122.41, determining the area where the pixel (x, y) satisfying the following formula is located in the target frame as the occlusion area of the target frame with respect to the frame to be determined:
|u_1'[x, y] - u_2'[x, y]| + \eta'\,\big(\|\nabla u_1'[x, y]\| + \|\nabla u_2'[x, y]\|\big) > T'

wherein u_1'[x, y] represents the optical flow displacement of pixel (x, y) in the first optical flow estimation result, and u_2'[x, y] represents the optical flow displacement of pixel (x, y) in the second optical flow estimation result. η' represents an adjustment coefficient, which may be set to any value greater than zero according to actual processing requirements. T' represents an optical flow threshold, which may also be an empirical value set according to actual requirements.
It will be appreciated that, in the above formula, |u_1'[x, y] - u_2'[x, y]| denotes the determinant value of the difference matrix between the two optical flow displacement matrices obtained by the two optical flow estimations; it represents the magnitude of the difference between the two optical flow estimation results at each pixel of the video frame.
The term \|\nabla u_1'[x, y]\| + \|\nabla u_2'[x, y]\| measures the sum of the rates of change of the two optical flow displacements at each pixel, which to some extent represents the internal smoothness of the optical flow at that pixel: the larger the value, the less smooth the optical flow. Using this formula, pixels for which the two optical flow estimation results differ greatly and the optical flow is not smooth can be screened out, so that the occlusion region of the target frame with respect to the pending frame can be determined accurately.
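A compact sketch of how such a screening formula could be evaluated over whole flow fields is given below; treating the flow difference with a per-pixel Euclidean norm and approximating the rates of change with numerical gradients are simplifying assumptions made for illustration.

```python
import numpy as np

def occlusion_mask(flow1, flow2, eta=1.0, thresh=3.0):
    """Sketch: mark pixels where the two optical flow estimates disagree
    strongly and the flow is not smooth, i.e. the occlusion (abrupt) region."""
    diff = np.linalg.norm(flow1 - flow2, axis=-1)           # |u1'[x, y] - u2'[x, y]|

    def grad_mag(flow):
        total = np.zeros(flow.shape[:2])
        for c in range(2):                                   # both displacement components
            gy, gx = np.gradient(flow[..., c])
            total += np.sqrt(gx ** 2 + gy ** 2)
        return total

    roughness = grad_mag(flow1) + grad_mag(flow2)            # rate-of-change term
    return diff + eta * roughness > thresh                   # True = occluded pixel
```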
Step S122.5, calculating the area of the occlusion region of the target frame with respect to the frame to be determined. After determining the occlusion region of each target frame relative to the corresponding frame to be determined by the above steps, the area of the occlusion region may be counted using any suitable method. By way of example and not limitation, the number of pixels contained in the occlusion region may be taken as the area of the occlusion region. Alternatively, the area of the shielding region may be determined in other manners, for example, the area of the smallest circumscribed rectangle of the shielding region may be used as the area of the shielding region.
Step S123, calculating the sum of the areas of the occlusion regions of the target frames of the pending frame relative to the pending frame. In the embodiment in which the 5th frame of the 10 video frames is the pending frame, the areas of the occlusion regions of the remaining 9 video frames relative to the 5th frame have been calculated in step S122, yielding 9 occlusion-region areas. In this step, the sum of these 9 areas can be calculated.
Step S124, comparing the area sums calculated for the respective pending frames, and determining the pending frame corresponding to the smallest area sum as the reference frame. In the above embodiment in which the plurality of video frames are 10 video frames of a target face, a first area sum may be calculated with the 1st frame as the pending frame, a second area sum with the 2nd frame as the pending frame, and so on, up to a tenth area sum with the 10th frame as the pending frame. The minimum of these 10 area sums may be found, and the pending frame corresponding to the minimum is determined as the reference frame. For example, if the fifth area sum is the minimum of the 10 area sums, the 5th frame may be determined as the reference frame. Those skilled in the art will understand the implementation and various extensions of this scheme, which are not repeated here.
In the above scheme, the plurality of video frames may be traversed, with each video frame used in turn as a pending frame. For each pending frame, a first optical flow estimation result between each of its target frames and the pending frame, and a second optical flow estimation result between the edge maps of that target frame and the pending frame, are determined. Then, based on the difference between the two optical flow estimation results, the occlusion region of the target frame relative to the pending frame and its area are determined, and the sum of the areas of the occlusion regions of all target frames relative to the pending frame is calculated. By comparing the area sums corresponding to the respective pending frames, the pending frame with the smallest area sum is determined as the reference frame. With this scheme, the optimal reference frame among the plurality of video frames can be determined accurately, and the target frames can be registered to this optimal reference frame in subsequent steps. For example, for a plurality of video frames of a face captured as the face gradually turns from the left profile to the right profile, the video frame corresponding to the frontal face can be determined as the reference frame by the above scheme. The scheme therefore helps to improve the accuracy of image registration and to guarantee a good visual effect for each registered video frame, so that the user experience is better.
In another example, in order to save computation, step S121 may instead use, as the pending frames, only those video frames among the plurality of video frames that satisfy a preset timing requirement. The video frames satisfying the preset timing requirement may be the video frames collected at intermediate times among the plurality of video frames. For example, if the plurality of video frames is 11 video frames, the (6±2)-th video frames, that is, the 4th to 8th frames, may each be used as a pending frame in step S121, and the above steps S122 to S124 may be performed for each of them.
Since the dynamic changes of a target object are usually stable and continuous across a plurality of successively captured video frames of that object, the pending frame corresponding to the minimum area sum is typically located near the middle frame. With this scheme it is not necessary to use every video frame as a pending frame for calculation, so the amount of calculation can be effectively reduced while the accuracy is maintained, and the processing speed is improved.
Illustratively, the method 100 further includes step S180.
Step S180, determining an occlusion region and a non-occlusion region of the target frame relative to the reference frame based at least on the difference between the first optical flow estimation result and the second optical flow estimation result. A method similar to that of the aforementioned step S122.4 may be used to determine the occlusion region of the target frame relative to the reference frame, and the details are not repeated here. The non-occlusion region may be the region of the target frame other than the occlusion region. It will be appreciated that the occlusion region may be a region where the optical flow between the two video frames changes abruptly, while the non-occlusion region may be a region where the optical flow is relatively smooth.
In one example, step S180 determines an occlusion region and a non-occlusion region of the target frame relative to the reference frame based at least on a difference between the first optical flow estimation result and the second optical flow estimation result, including steps S180.1 and S180.2.
In step S180.1, the region where the pixels (x, y) satisfying the following formula are located is determined as the occlusion region of the target frame relative to the reference frame:
[Formula image in the original: the occlusion criterion, combining the difference of the two optical flow displacement matrices with a smoothness term weighted by η and comparing the result against the threshold T.]
wherein u₁[x, y] represents the optical flow displacement matrix of the first optical flow estimation result for pixel (x, y), u₂[x, y] represents the optical flow displacement matrix of the second optical flow estimation result for pixel (x, y), η represents an adjustment coefficient, which may be set to any value greater than zero according to the actual processing requirements, and T represents the optical flow threshold. This step is similar to the scheme of the aforementioned step S122.41, and those skilled in the art can understand its implementation from that description, so it is not repeated here. In step S180.2, the region of the target frame other than the occlusion region is determined as the non-occlusion region, which may be the region of the target frame whose optical flow relative to the reference frame is comparatively smooth.
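A minimal sketch of such a screening step is given below, assuming u1 and u2 are H×W×2 arrays of per-pixel optical flow displacements from the first and second estimates, and using finite differences for the rate-of-change term; the array layout, parameter defaults, and use of NumPy are illustrative assumptions:

import numpy as np

def flow_roughness(u):
    # Sum of absolute spatial gradients of both flow components at each pixel.
    total = np.zeros(u.shape[:2])
    for c in range(2):
        gy, gx = np.gradient(u[..., c])
        total += np.abs(gy) + np.abs(gx)
    return total

def occlusion_mask(u1, u2, eta=1.0, T=1.0):
    # Magnitude of the difference between the two optical flow displacements.
    diff = np.linalg.norm(u1 - u2, axis=-1)
    # Combined smoothness term of the two flow fields.
    rough = flow_roughness(u1) + flow_roughness(u2)
    # True where the pixel is treated as occluded (optical flow abrupt-change region).
    return diff + eta * rough > T

The non-occlusion region is then simply the logical complement of the returned mask.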
It can be understood that through the above scheme, pixels with large difference of the two optical flow estimation results and unsmooth optical flow in the target frame can be screened out, so that the occlusion region of the target frame relative to the reference frame can be accurately determined. In other words, the optical flow abrupt change region in the target frame relative to the reference frame can be determined more accurately. And thus the region of the target frame that is relatively smooth with respect to the optical flow in the reference frame can be accurately determined. In addition, the calculation amount of the scheme is small, and the execution logic is simple.
Step S190 adjusts the target frame based on the first optical flow estimation result and the second optical flow estimation result to obtain an adjusted video frame, including step S191 and step S192.
Step S191, for each first pixel in the non-occlusion region of the target frame, adjusts the first pixel directly according to the first optical flow estimation result of the first pixel. The pixels located in the non-occlusion region may be referred to as first pixels. As described above, the non-occlusion region may be a region where the optical flow is relatively smooth. The optical flow of this region can be determined relatively accurately based on a conventional optical flow estimation algorithm. Thus, each pixel in the region may be registered to the corresponding pixel in the reference frame directly based on the first optical flow estimation result. For example, the position of each first pixel in the target frame may be adjusted based on the optical flow displacement matrix u₁[x, y] in the previous example. The position coordinates of each first pixel can be substituted into u₁[x, y] to obtain the optical flow displacement of that pixel, and each first pixel can then be shifted by the corresponding displacement in the target frame.
Step S192, for each second pixel in the occlusion region of the target frame, adjusts the second pixel according to the first optical flow estimation results of other pixels in the target frame. Each pixel located in the occlusion region may be referred to as a second pixel. Since the occlusion region is a region where the optical flow changes abruptly, the corresponding position of such a pixel in the reference frame cannot be determined accurately. Therefore, its optical flow displacement may be determined based on the first optical flow estimation results of pixels other than that pixel, in order to register it to the reference frame. For example, the optical flow displacement may be determined based on the first optical flow estimation result of a first pixel near the second pixel, and the position of the pixel may be adjusted based on that displacement. Alternatively, the optical flow displacement of the second pixel may be determined based on the first optical flow estimation results of other second pixels near the second pixel, so as to adjust the position of the pixel in the target frame. For example, the optical flow displacements of the first optical flow estimation results of all the second pixels may be averaged, and each second pixel may be adjusted based on the average displacement. Of course, other suitable methods for adjusting the second pixels may also be used.
In the above scheme, the occlusion region and the non-occlusion region of the target frame relative to the reference frame are first determined based on the difference between the first optical flow estimation result and the second optical flow estimation result. The pixels in the two regions are then adjusted by different methods to register the target frame to the reference frame, as sketched below. Each pixel in the smooth optical flow region represented by the non-occlusion region is adjusted directly using its first optical flow estimation result, while the optical flow abrupt-change region represented by the occlusion region is adjusted using the first optical flow estimation results of other pixels. With this scheme, the accuracy of registering the target frame to the reference frame is relatively high, and the amount of calculation is small.
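A minimal sketch of this two-branch adjustment is shown below; it warps the target frame toward the reference frame with a forward mapping rounded to the nearest pixel, and uses the simple alternative mentioned above of assigning every occluded pixel the average displacement of the occlusion region. The flow layout (dy, dx), the forward-warping strategy, and the function name are assumptions for illustration:

import numpy as np

def register_target_frame(target, u1, occ_mask):
    # target: H x W (or H x W x C) image; u1: H x W x 2 flow (dy, dx);
    # occ_mask: H x W boolean occlusion mask of the target frame.
    h, w = occ_mask.shape
    out = np.zeros_like(target)
    # Shared displacement for occluded (second) pixels: mean flow over the occlusion region.
    fill_flow = u1[occ_mask].mean(axis=0) if occ_mask.any() else np.zeros(2)
    for y in range(h):
        for x in range(w):
            dy, dx = fill_flow if occ_mask[y, x] else u1[y, x]
            ny, nx = int(round(y + dy)), int(round(x + dx))
            if 0 <= ny < h and 0 <= nx < w:
                out[ny, nx] = target[y, x]
    return out

A backward-warping variant with interpolation would avoid holes in the output, at the cost of a slightly more involved sketch.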
Illustratively, step S192 adjusts the second pixel according to the first optical flow estimation results of other pixels in the target frame, including step S192.1.
Step S192.1, for each second pixel in the occlusion region of the target frame, adjusting the second pixel according to the first optical flow estimation result of at least one first pixel nearest to that second pixel in the occlusion region of the target frame. Alternatively, the second pixel may be adjusted directly based on the first optical flow estimation result of the single first pixel nearest to it. Alternatively, the second pixel may be adjusted based on the first optical flow estimation result of a specific pixel among the plurality of pixels nearest to the second pixel, for example the pixel whose pixel value is closest to that of the second pixel. Alternatively, the second pixel may be adjusted based on the first optical flow estimation results of a plurality of pixels nearest to the second pixel. For example, the optical flow displacements of the first optical flow estimation results of the 10 pixels nearest to the second pixel may be weighted and averaged, and the weighted-average displacement may be used as the optical flow displacement of the second pixel for the adjustment.
It will be appreciated that, since the optical flow of each first pixel located in the non-occlusion region is relatively smooth, the optical flow estimates for these pixels are relatively accurate. Moreover, since the dynamic change of the target object is stable and continuous, two pixels that are adjacent in the target frame are likely to be adjacent in position in the reference frame as well. Accordingly, the optical flow of each second pixel can be determined relatively accurately based on the more accurate optical flow estimates of the first pixels nearest to it. Each second pixel is then adjusted based on the optical flow determined in this way, so that the occlusion region in the target frame can be aligned accurately with the reference frame, which can significantly improve the accuracy of image registration. Moreover, the calculation of this scheme is simple and its amount of calculation is small, so the processing efficiency is high.
Illustratively, step S192.1 adjusts the second pixel according to the first optical flow estimation result of at least one first pixel nearest to the second pixel in the occlusion region of the target frame, including steps S192.11, S192.12, and S192.13.
In step S192.11, a plurality of first pixels nearest to the second pixel in the occlusion region of the target frame are determined. The plurality of pixels is, for example, 10 pixels; that is, the 10 first pixels nearest to each second pixel may be determined in this step.
Step S192.12, performing an inverse-distance weighted average of the first optical flow estimation results of the determined first pixels according to the distance between each of these first pixels and the second pixel, to determine the optical flow displacement of the second pixel. For example, the optical flow displacements of the first optical flow estimation results of the 10 first pixels may be averaged with weights proportional to the reciprocal of the distance. That is, among the 10 first pixels, the optical flow displacement of a first pixel closer to the second pixel receives a larger weight, and that of a first pixel farther from the second pixel receives a smaller weight. Thereby, the accuracy of the determined optical flow displacement of the second pixel can be ensured to a high degree.
Step S192.13, adjusting the second pixel according to the optical flow displacement of the second pixel. For example, the second pixel may be moved to a new position in the target frame in accordance with the optical flow displacement.
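A minimal sketch of steps S192.11 to S192.13 is given below; it uses a KD-tree to find the k nearest first pixels for every second pixel and weights their flow displacements by the reciprocal of the distance. The use of scipy, the default k=10, and the small epsilon guard against division by zero are assumptions, not requirements of the method:

import numpy as np
from scipy.spatial import cKDTree

def fill_occluded_flow(u1, occ_mask, k=10, eps=1e-6):
    # u1: H x W x 2 first optical flow estimate; occ_mask: H x W boolean occlusion mask.
    first_yx = np.argwhere(~occ_mask)           # coordinates of first (non-occluded) pixels
    second_yx = np.argwhere(occ_mask)           # coordinates of second (occluded) pixels
    tree = cKDTree(first_yx)
    dists, idx = tree.query(second_yx, k=k)     # k nearest first pixels per second pixel
    weights = 1.0 / (dists + eps)               # inverse-distance weights
    weights /= weights.sum(axis=1, keepdims=True)
    rows, cols = first_yx[idx][..., 0], first_yx[idx][..., 1]
    neighbor_flows = u1[rows, cols]             # shape (num_second, k, 2)
    filled = u1.copy()
    # Step S192.13: the occluded pixels are then shifted by these displacements.
    filled[occ_mask] = (weights[..., None] * neighbor_flows).sum(axis=1)
    return filled

The returned flow field can be passed to the warping step of S190 so that first and second pixels are adjusted in a single pass.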
The method for adjusting the second pixels is more accurate and reasonable, and the accuracy of image registration can be further improved. In addition, the calculated amount is also small, so that the processing efficiency is also high.
In the prior art, some methods reconstruct the surface normal vector of a target object with relatively complex equipment, at an extremely high implementation cost. Other methods reconstruct the surface normal vector of the target object by registering multiple video frames of the target object taken in quick succession. Taking the photometric stereo method as an example, with an image acquisition device and a group of program-controllable point light sources, each point light source can be lit in turn while the target object is being photographed, so as to obtain a plurality of video frames of the target object with different brightness under the different light sources. The plurality of video frames are then registered to obtain the surface normal vector of the target object. However, photometric stereo usually requires the target object to be absolutely stationary during shooting; otherwise registration is difficult to achieve and the surface normal vector of the target object cannot be reconstructed. For this reason, the prior art uses this method to reconstruct the surface normal vector of static objects, but not of dynamic objects such as a human body, since a dynamic object cannot remain absolutely stationary even for a short period of time. The application range of the surface normal vector reconstructed by the photometric stereo method in the prior art is therefore narrow.
According to a second aspect of the present application, there is also provided a surface normal vector reconstruction method. Fig. 4 shows a schematic flow chart of a surface normal vector reconstruction method 400 according to one embodiment of the present application. As shown, the reconstruction method 400 includes step S410 and step S420.
In step S410, a plurality of video frames of the target object are registered using the image registration method 100 described above. Illustratively, the plurality of video frames may be acquired using the same video frame acquisition device. For example, the image registration method can be used to register successive video frames of a target face captured by the same camera under different light source positions.
Step S420, reconstructing a surface normal vector of the target object using the adjusted video frame. For example, a photometric stereo method may be used to reconstruct the surface normal vector of the target face based on the registered continuous video frames of the target face.
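A minimal sketch of the classical Lambertian photometric stereo step is given below, assuming the registered frames are grayscale images captured under known light directions stacked into an n×3 matrix; the least-squares formulation and NumPy usage follow the standard textbook method and are not taken from the original text:

import numpy as np

def reconstruct_normals(images, light_dirs):
    # images: list of n registered H x W grayscale frames of the target object
    # light_dirs: n x 3 array of unit light direction vectors, one per frame
    h, w = images[0].shape
    I = np.stack([img.reshape(-1) for img in images], axis=0)   # n x (H*W) intensities
    # Solve light_dirs @ G = I in the least-squares sense; G = albedo * normal per pixel.
    G, *_ = np.linalg.lstsq(light_dirs, I, rcond=None)          # 3 x (H*W)
    albedo = np.linalg.norm(G, axis=0)
    normals = G / np.maximum(albedo, 1e-8)                      # unit surface normals
    return normals.T.reshape(h, w, 3), albedo.reshape(h, w)

The quality of these normals depends directly on how well the frames were registered, which is the motivation for the image registration method 100.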
It can be appreciated that, since the image registration method according to the embodiments of the present application improves the optical flow estimation accuracy for a plurality of video frames of target objects, including dynamic objects, the registration accuracy of a plurality of video frames of a dynamic object can be improved. As a result, the application range of surface normal vector reconstruction methods such as photometric stereo can be significantly broadened. In addition, the method requires little calculation and achieves a good reconstruction effect.
According to a third aspect of the present application, there is also provided an image registration system. Fig. 5 shows a schematic block diagram of an image registration system 500 according to one embodiment of the present application. As shown, the image registration system 500 includes an acquisition module 510, a first determination module 520, an edge detection module 530, a second determination module 540, and an adjustment module 550.
An acquisition module 510 is configured to acquire a plurality of video frames.
The first determining module 520 is configured to determine a first optical flow estimation result from a target frame to a reference frame in the plurality of video frames based on an optical flow estimation algorithm.
The edge detection module 530 is configured to perform edge detection on the target frame and the reference frame, respectively, so as to obtain an edge video frame of the target frame and an edge video frame of the reference frame, respectively.
A second determining module 540 for determining a second optical flow estimation result from the edge video frame of the target frame to the edge video frame of the reference frame using a second loss function and based on an optical flow estimation algorithm, wherein the second loss function includes a partial formula of the loss with respect to the rate of change of the high frequency information in the edge video frame.
An adjustment module 550, configured to adjust the target frame based on the first optical flow estimation result and the second optical flow estimation result, so as to obtain an adjusted video frame.
According to a fourth aspect of the present application, there is also provided a surface normal vector reconstruction system. Fig. 6 shows a schematic block diagram of a surface normal vector reconstruction system 600 according to one embodiment of the present application. As shown, the reconstruction system 600 includes a registration module 610 and a reconstruction module 620.
A registration module 610, configured to register a plurality of video frames of the target object using the image registration method 100. Illustratively, multiple video frames may be acquired using the same video frame acquisition device.
A reconstruction module 620 is configured to reconstruct a surface normal vector of the target object using the adjusted video frames.
According to a fifth aspect of the present application, there is also provided an electronic device. Fig. 7 shows a schematic block diagram of an electronic device 700 according to one embodiment of the present application. As shown, the electronic device 700 includes a processor 710 and a memory 720. The memory 720 stores computer program instructions which, when executed by the processor 710, are used to perform the above-described image registration method 100 and/or the above-described surface normal vector reconstruction method 400.
According to a sixth aspect of the present application, there is also provided a storage medium having stored thereon program instructions which, when executed, are adapted to carry out the above-described image registration method 100 and/or the above-described surface normal vector reconstruction method 400.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, e.g., the division of the elements is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple elements or components may be combined or integrated into another device, or some features may be omitted or not performed.
Similarly, it should be appreciated that in order to streamline the application and aid in understanding one or more of the various inventive aspects, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof in the description of exemplary embodiments of the application. However, the method of this application should not be construed to reflect the following intent: i.e., the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
It will be understood by those skilled in the art that all of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be combined in any combination, except combinations where the features are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Various component embodiments of the present application may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functions of some of the modules in an image registration system or surface normal vector reconstruction system according to embodiments of the present application may be implemented in practice using a microprocessor or Digital Signal Processor (DSP). The present application may also be embodied as device programs (e.g., computer programs and computer program products) for performing part or all of the methods described herein. Such a program embodying the present application may be stored on a computer readable medium, or may have the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the application, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The application may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. do not denote any order. These words may be interpreted as names.
The foregoing is merely illustrative of specific embodiments of the present application and the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes or substitutions are intended to be covered by the scope of the present application. The protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. An image registration method, comprising:
acquiring a plurality of video frames;
determining a first optical flow estimation result from a target frame to a reference frame in the plurality of video frames based on an optical flow estimation algorithm;
respectively carrying out edge detection on the target frame and the reference frame to respectively obtain an edge image of the target frame and an edge image of the reference frame;
determining a second optical flow estimation result from the edge image of the target frame to the edge image of the reference frame using a second loss function and based on the optical flow estimation algorithm, wherein the second loss function includes a partial formula of loss with respect to a rate of change of high frequency information in the edge image; and
and adjusting the target frame based on the first optical flow estimation result and the second optical flow estimation result to acquire an adjusted video frame.
2. The image registration method of claim 1, wherein the second loss function E₂(u) is expressed using the following formula:
[Formula image in the original: the expression of the second loss function E₂(u).]
wherein P₁(x, y) represents the pixel value of pixel (x, y) in the edge image of the target frame, P₂(x, y) represents the pixel value of pixel (x, y) in the edge image of the reference frame, u₂[x, y] represents the optical flow displacement matrix of the second optical flow estimation result of pixel (x, y), and β and γ respectively represent adjustment coefficients.
3. The image registration method according to claim 1 or 2, wherein,
the method further comprises the steps of:
determining an occlusion region and a non-occlusion region of the target frame relative to the reference frame based at least on a difference between the first optical flow estimation result and the second optical flow estimation result;
the adjusting the target frame based on the first optical flow estimation result and the second optical flow estimation result to obtain an adjusted video frame includes:
for each first pixel in a non-occlusion region of the target frame, directly adjusting the first pixel according to a first optical flow estimation result of the first pixel;
for each second pixel in the occlusion region of the target frame, the second pixel is adjusted according to the first optical flow estimation results of other pixels in the target frame.
4. The image registration method of claim 3, wherein said adjusting the second pixel according to the first optical flow estimates of other pixels in the target frame includes:
for each second pixel in the occlusion region of the target frame, the second pixel is adjusted according to the first optical flow estimation result of at least one first pixel nearest to the second pixel in the occlusion region of the target frame.
5. The image registration method of claim 4, wherein the adjusting the second pixel according to the first optical flow estimation of at least one first pixel nearest to the second pixel in the occlusion region of the target frame includes:
determining a plurality of first pixels closest to the second pixel in an occlusion region of the target frame;
performing an inverse-distance weighted average of the first optical flow estimation results of the determined first pixels according to the determined distance between each first pixel and the second pixel, and determining the optical flow displacement of the second pixel; and
the second pixel is adjusted according to the optical flow displacement of the second pixel.
6. The image registration method of claim 3, wherein the determining occlusion and non-occlusion regions of the target frame relative to the reference frame based at least on a difference of the first and second optical flow estimates comprises:
determining a region where pixels (x, y) satisfying the following formula are located as the occlusion region of the target frame relative to the reference frame:
[Formula image in the original: the occlusion criterion compared against the optical flow threshold T.]
wherein u₁[x, y] represents the optical flow displacement matrix of the first optical flow estimation result of pixel (x, y), u₂[x, y] represents the optical flow displacement matrix of the second optical flow estimation result of pixel (x, y), η represents an adjustment coefficient, η > 0, and T represents the optical flow threshold; and
determining the region outside the occlusion region as the non-occlusion region.
7. The image registration method of claim 1 or 2, wherein the method further comprises:
determining each frame of at least part of the plurality of video frames as a pending frame;
for each pending frame, performing the following operations for each target frame of the pending frame,
determining a first optical flow estimation result from the target frame to the to-be-determined frame based on the optical flow estimation algorithm;
respectively carrying out edge detection on the target frame and the frame to be determined so as to respectively obtain an edge image of the target frame and an edge image of the frame to be determined;
determining a second optical flow estimation result from the edge image of the target frame to the edge image of the undetermined frame by using the second loss function and based on the optical flow estimation algorithm;
determining an occlusion region of the target frame relative to the frame to be determined based at least on a difference between a first optical flow estimation result of the target frame to the frame to be determined and a second optical flow estimation result of the target frame to the frame to be determined;
calculating the area of the occlusion region of the target frame relative to the frame to be determined;
calculating the sum of the areas of the occlusion regions of all target frames of the frame to be determined relative to the frame to be determined; and
comparing the area sums calculated for the respective frames to be determined, and determining the frame to be determined corresponding to the smallest area sum as the reference frame.
8. A method of surface normal vector reconstruction, comprising:
registering a plurality of video frames of a target object using the image registration method of any one of claims 1 to 7;
reconstructing a surface normal vector of the target object by using the adjusted video frame.
9. An electronic device comprising a processor and a memory, wherein the memory has stored therein computer program instructions which, when executed by the processor, are adapted to carry out the image registration method of any one of claims 1 to 7 and/or the surface normal vector reconstruction method of claim 8.
10. A storage medium having stored thereon program instructions for performing the image registration method of any one of claims 1 to 7 and/or the surface normal vector reconstruction method of claim 8 when run.