
CN115115860A - Image feature point detection matching network based on deep learning - Google Patents


Info

Publication number
CN115115860A
CN115115860A
Authority
CN
China
Prior art keywords
matrix
matching
fitting
pixel
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210856359.3A
Other languages
Chinese (zh)
Inventor
罗欣
赖广龄
吴禹萱
韦祖棋
宋依芸
常乐
许文波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yangtze River Delta Research Institute of UESTC Huzhou
Original Assignee
Yangtze River Delta Research Institute of UESTC Huzhou
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yangtze River Delta Research Institute of UESTC Huzhou filed Critical Yangtze River Delta Research Institute of UESTC Huzhou
Priority to CN202210856359.3A priority Critical patent/CN115115860A/en
Publication of CN115115860A publication Critical patent/CN115115860A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/757 - Matching configurations of points or features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention designs a fusion network based on an improved SuperPoint network and an improved SuperGlue network. The network extracts image feature points with a fully convolutional network, uses a sub-pixelization module that exploits neighborhood window information to improve the coordinate precision of the feature points, jointly encodes the feature point coordinates and feature vectors, simulates the human process of matching feature points with an attention mechanism, and solves the matching relation with the Sinkhorn algorithm. The invention further designs an adaptive spatial constraint layer that screens the coarse matching point pairs with several methods in parallel under spatial constraints, adaptively judges the spatial relation between the images, and extracts matched feature point pairs from the input images.

Description

Image feature point detection matching network based on deep learning
Technical Field
The invention belongs to the field of computer image processing and relates to a method that detects and matches image feature points with a deep learning approach and outputs the matched feature point pairs and the spatial relationship matrix between two images.
Background
A feature point consists of a key point that carries texture and structure information in the image and a descriptor corresponding to that key point; feature point detection is therefore divided into key point detection and descriptor computation. Commonly used key point detectors include the Laplacian method, Harris corner detection, difference-of-Gaussians detection, and the FAST corner detector. The SIFT operator detects key points in a DoG scale pyramid and computes a 128-dimensional descriptor encoding gradient orientation information; the descriptor is scale-, rotation- and affine-invariant and has excellent localization accuracy, but its computation is complex and therefore slow. Rublee proposed the ORB operator, which combines an improved FAST key point detector with an improved BRIEF descriptor; because the binary descriptor is encoded from neighborhood pixels, ORB is very fast while retaining scale invariance and is used in many real-time tasks. With the rapid development of deep learning, researchers have also explored learned feature point detection. SuperPoint is a self-supervised deep learning feature point extraction algorithm with good image understanding and feature description capability, and it runs far faster on a GPU than traditional feature extraction algorithms. SuperGlue is an attention-GNN-based matching network adapted to SuperPoint that uses an attention mechanism to simulate how humans match images. SuperPoint's drawback is that its feature point coordinates are integers rather than floating-point values, which limits it in tasks with higher precision requirements; SuperGlue's drawback is the lack of explicit spatial constraints when using a GNN to simulate human vision. At present, mainstream feature point detection with three-dimensional reconstruction as the downstream task still relies on hand-crafted feature detection methods represented by SIFT.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention discloses a feature point detection and matching method based on deep learning. The invention designs a fusion network based on an improved SuperPoint network and an improved SuperGlue network: the network extracts image feature points with a fully convolutional network, uses a sub-pixelization module that exploits neighborhood window information to improve the coordinate precision of the feature points, jointly encodes the feature point coordinates and feature vectors, simulates the human process of matching feature points with an attention mechanism, and solves the matching relation with the Sinkhorn algorithm. The invention further designs an adaptive spatial constraint layer that screens the coarse matching point pairs with several methods in parallel under spatial constraints, adaptively judges the spatial relation between the images, and extracts matched feature point pairs from the input images.
The invention adopts the following technical steps:
Step 1: extract fine feature point coordinates of the image to be matched using a SuperPoint network improved to sub-pixel precision.
Step 2: extract the feature point and descriptor joint coding vectors from the sub-pixel-precision feature point coordinates.
Step 3: perform steps 1 and 2 on each of the two pictures to be matched to obtain two groups of feature point joint vectors, and input the two groups of vectors into the SuperGlue attention graph neural network and the optimal matching layer to obtain a matching relation matrix that describes the coarse matching relation between feature points and the corresponding confidence. Sort the matching relations by confidence from high to low.
Step 4: input the coordinates of all feature points in the two pictures and the matching relation matrix into the spatial constraint layer designed by the invention, which contains parallel fundamental matrix and homography matrix constraints, and select the H-F spatial constraint model according to the bidirectional reprojection errors.
Step 4.1: using the fundamental matrix as the fitting model, fit all matching point pairs with the least-median-of-squares method to obtain a fitted matrix F1 and the set of matching point pairs consistent with F1 under a projection error threshold of 2 pixels.
Step 4.2: using the fundamental matrix as the fitting model, fit all matching point pairs with progressive sample consensus, dynamically allocating 150-300 iterations according to the average confidence of the matching point pairs in the top 30% of the confidence ranking, to obtain a fitted matrix F2 and the set of matching point pairs consistent with F2 under a projection error threshold of 2 pixels.
Step 4.3: using the homography matrix as the fitting model, fit all matching point pairs with the least-median-of-squares method to obtain a fitted matrix H1 and the set of matching point pairs consistent with H1 under a projection error threshold of 2 pixels.
Step 4.4: using the homography matrix as the fitting model, fit all matching point pairs with progressive sample consensus, dynamically allocating 150-300 iterations according to the average confidence of the matching point pairs in the top 30% of the confidence ranking, to obtain a fitted matrix H2 and the set of matching point pairs consistent with H2 under a projection error threshold of 2 pixels. Steps 4.1, 4.2, 4.3 and 4.4 run concurrently to save time.
Step 4.5: the homography matrix describes the relation between points and their projections, and the distance between a projected point and its matching point is taken as the projection error. Compute the average bidirectional reprojection error from H1 and its matching point pair set, compute the error for H2 in the same way, take the smaller of the two, keep the corresponding matrix as H, and record its average bidirectional reprojection error as SH.
Step 4.6: the fundamental matrix describes the relation between points and their projected epipolar lines, and the point-to-line distance between the projected epipolar line and the matching point is taken as the projection error. Compute the average bidirectional reprojection error from F1 and its matching point pair set, compute the error for F2 in the same way, take the smaller of the two, keep the corresponding matrix as F, and record its average bidirectional reprojection error as SF.
Step 4.7: when SH/(SF + SH) > 0.4, keep the matrix H and output the matching point pair set corresponding to H; otherwise keep the matrix F and output the matching point pair set corresponding to F. A code sketch of this adaptive selection follows the step list.
Compared with the prior art, the invention has the following beneficial effects:
(1) In feature detection and matching tasks oriented to three-dimensional reconstruction, the network achieves better accuracy.
(2) Compared with traditional feature point detection and matching methods such as SIFT and ORB, the method is more robust in scenes with strong illumination changes, large viewpoint changes and the like.
Drawings
FIG. 1 is a diagram of a network architecture of the present invention
FIG. 2 is a block diagram of a sub-pixelization module according to the present invention
FIG. 3 is a block diagram of a descriptor decoder according to the present invention
FIG. 4 is a diagram of a space constrained layer structure according to the present invention
Detailed description of the preferred embodiment
The invention is further described below with reference to the accompanying drawings.
The invention designs a fast feature point detection and matching network framework: the SuperPoint network used for feature point detection is improved to sub-pixel precision, and a dynamic progressive sample consensus (PROSAC) module with adaptive spatial constraints is added to the SuperGlue network for feature point matching. Fig. 1 shows the network structure of the invention.
First, the fine feature point coordinates of the image to be matched are extracted with the sub-pixel-precision-improved SuperPoint network. This comprises the following three steps:
1. and inputting the image to be matched into an encoder of a VGG structure of SuperPoint, and extracting a feature map of the image, wherein the encoder comprises 8 convolution layers, 3 pooling layers, a plurality of BN layers and an activation function layer. The convolution layers have the convolution kernel number of 64, 128 and 128 in sequence, are used for extracting features, and the three 2 multiplied by 2 maximum pooling layers sample H multiplied by W pictures to be H/8 in height and W/8 in width. An H multiplied by W image is coded by a coder to obtain an H/8 multiplied by W/8 multiplied by 128 characteristic map;
2. inputting the characteristic diagram obtained in the step 1 into a characteristic point decoder of SuperPoint, wherein the decoder comprises 2 convolutional layers with 256 and 65 channels, a plurality of BN layers and activation function layers, and outputs a tensor of W/8 xH/8 x65. Each 65-dimensional tensor represents 65 cases that the ith pixel in an 8 x 8 pixel window where the original images do not overlap is a feature point or the feature point is not contained in the pixel window. Obtaining normalized probability distribution of 65 conditions through Softmax layer classification, reducing the size to H multiplied by W multiplied by 1 through a Reshape layer to obtain an H multiplied by W scoring graph, wherein each pixel value of the scoring graph is distributed between 0 and 1 and represents the probability that each pixel point on an input image I is a feature point;
3. in the original SuperPoint, the strategy for extracting feature points is to apply non-maximum suppression (NMS) to each NxN window in the output score map, each window only retains one maximum, then threshold judgment is performed on the whole map, and points higher than the threshold are regarded as feature points. This output mode has only integer-level coordinate accuracy and is not differentiable. The invention adds a sub-pixel correction module combining the neighborhood information of the characteristic points to SuperPoint, and the main flow is shown in figure 2. Inputting the score chart obtained in the step 2 into a coordinate sub-pixelization module designed by the invention, wherein the process comprises the following steps: and adopting non-maximum suppression for each non-overlapping 4 x 4 pixel window, and setting the pixel values except the maximum value in each pixel window as 0 to obtain a coarse characteristic point diagram. Points in the coarse feature point diagram which are larger than a certain threshold value are regarded as coarse feature points, for all M coarse feature points, a pixel window of 5 multiplied by 5 is taken in the score map by taking the coordinate of each coarse feature point as the center, deviation expectation of each pixel in the pixel window relative to the center is calculated respectively in the x direction and the y direction by using a Softargmax method, and the deviation expectation and the coarse feature point coordinates are added to obtain M sub-pixilated fine feature point coordinates.
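As a concrete illustration of items 2 and 3 above, the following PyTorch sketch decodes the 65-channel tensor into an H×W score map and refines the coarse maxima with a Softargmax over 5×5 neighborhoods. It is a minimal sketch rather than the trained network: the 0.015 detection threshold, the unit softmax temperature and the border handling are assumptions; only the 4×4 NMS window and the 5×5 refinement window follow the text.

```python
import torch
import torch.nn.functional as F

def decode_score_map(logits):
    """Decode the H/8 x W/8 x 65 decoder output into an H x W score map.

    logits: (B, 65, H/8, W/8) tensor; the 65th channel is the "no feature
    point in this 8x8 cell" bin and is dropped after the softmax.
    """
    prob = F.softmax(logits, dim=1)[:, :-1]            # (B, 64, H/8, W/8)
    score = F.pixel_shuffle(prob, upscale_factor=8)    # (B, 1, H, W)
    return score.squeeze(1)                            # (B, H, W)

def subpixel_keypoints(score, nms_window=4, patch=5, thresh=0.015):
    """4x4 non-maximum suppression followed by Softargmax refinement.

    score: (H, W) score map of one image.
    Returns an (M, 2) tensor of sub-pixel (x, y) coordinates.
    """
    H, W = score.shape
    # keep only the maximum of each non-overlapping nms_window x nms_window cell
    pooled, idx = F.max_pool2d(score[None, None], nms_window,
                               stride=nms_window, return_indices=True)
    coarse = torch.zeros(H * W, dtype=score.dtype)
    coarse[idx.view(-1)] = pooled.view(-1)
    coarse = coarse.view(H, W)
    # coarse feature points: surviving maxima above the (assumed) threshold
    ys, xs = torch.nonzero(coarse > thresh, as_tuple=True)
    r = patch // 2
    offs = torch.arange(-r, r + 1, dtype=score.dtype)
    refined = []
    for y, x in zip(ys.tolist(), xs.tolist()):
        if y < r or x < r or y + r >= H or x + r >= W:
            refined.append((float(x), float(y)))       # keep border points as-is
            continue
        # Softargmax: expected offset of the score mass in the patch x patch window
        w = score[y - r:y + r + 1, x - r:x + r + 1]
        w = torch.softmax(w.reshape(-1), dim=0).reshape(patch, patch)
        dx = (w.sum(dim=0) * offs).sum()               # expectation along x
        dy = (w.sum(dim=1) * offs).sum()               # expectation along y
        refined.append((x + dx.item(), y + dy.item()))
    return torch.tensor(refined)

# usage: score = decode_score_map(logits)[0]; kpts = subpixel_keypoints(score)
```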
Next, the feature point and descriptor joint coding vectors are extracted from the sub-pixel-precision feature point coordinates, as shown in Fig. 3. The descriptor decoder receives the H/8×W/8×128 feature map and, after several convolutions, outputs an H/8×W/8×256 initial descriptor matrix. The original SuperPoint network interpolates this matrix to H×W×256 with bicubic interpolation and then L2-normalizes the 256 channels, computing a 256-dimensional descriptor for every pixel of the original image I; in practice, descriptors of non-feature-point pixels need not be computed. The improved descriptor decoder instead performs M bilinear interpolations in the initial descriptor matrix at the M sub-pixel coordinates output by the sub-pixelization module, obtaining M 256-dimensional vectors, which are L2-normalized to give the final 256-dimensional descriptors.
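A minimal sketch of the improved descriptor sampling described above, assuming a PyTorch implementation: the H/8×W/8×256 initial descriptor matrix is sampled bilinearly at the M sub-pixel coordinates with grid_sample and then L2-normalized. The exact mapping of image pixel coordinates to grid_sample's normalized range is an implementation detail not specified in the text and is assumed here.

```python
import torch
import torch.nn.functional as F

def sample_descriptors(desc_map, keypoints, image_size):
    """Bilinearly sample descriptors at the M sub-pixel keypoint coordinates.

    desc_map:   (B, 256, H/8, W/8) initial descriptor matrix.
    keypoints:  (B, M, 2) sub-pixel (x, y) coordinates in full-image pixels.
    image_size: (H, W) of the original image I.
    Returns (B, M, 256) L2-normalized descriptors.
    """
    H, W = image_size
    grid = keypoints.clone()
    # map pixel coordinates to grid_sample's [-1, 1] range (assumed convention)
    grid[..., 0] = grid[..., 0] / (W - 1) * 2 - 1
    grid[..., 1] = grid[..., 1] / (H - 1) * 2 - 1
    grid = grid[:, None]                                    # (B, 1, M, 2)
    desc = F.grid_sample(desc_map, grid, mode='bilinear',
                         align_corners=True)                # (B, 256, 1, M)
    desc = desc.squeeze(2).transpose(1, 2)                  # (B, M, 256)
    return F.normalize(desc, p=2, dim=-1)                   # final 256-d descriptors
```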
Then comes matching. The core idea of SuperGlue is to convert the feature point matching problem into an optimal transport problem over the feature point and descriptor joint coding vectors and to solve it iteratively with the Sinkhorn algorithm. An attention GNN simulates the repeated glancing that human eyes perform when matching, and cross-attention and self-attention mechanisms strengthen the joint matching of position coordinates and visual descriptors. SuperGlue imposes no explicit, strict spatial constraint during matching; to address this, the invention improves SuperGlue by adding a spatial constraint layer and raises matching precision with a dynamic progressive sample consensus module under adaptive H-F spatial constraints. Fig. 4 shows the structure of the spatial constraint layer. The first two stages are performed on each of the two pictures to be matched to obtain two groups of feature point joint vectors; the two groups of vectors are input into the SuperGlue attention graph neural network and the optimal matching layer to obtain a matching relation matrix describing the coarse matching relation between feature points and the corresponding confidence, and the matches are sorted by confidence from high to low. Finally, the coordinates of all feature points in the two pictures and the matching relation matrix are input into the spatial constraint layer designed by the invention, which contains parallel fundamental matrix and homography matrix constraints and selects the H-F spatial constraint model according to the bidirectional reprojection errors.
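The Sinkhorn step of the optimal matching layer can be illustrated with the simplified log-domain sketch below. It shows only the alternating row and column normalization that turns the similarity matrix of the joint coding vectors into a soft assignment; SuperGlue's learned dustbin score and dustbin marginals are replaced by a fixed assumed value, and the iteration count is likewise an assumption.

```python
import torch

def log_sinkhorn(scores, num_iters=50, dustbin_score=1.0):
    """Simplified log-domain Sinkhorn normalization for the optimal matching layer.

    scores: (M, N) similarity matrix between the joint coding vectors of the two
    images; a dustbin row and column (fixed value here, learned in SuperGlue)
    absorb unmatched feature points.
    Returns an (M+1, N+1) soft assignment matrix.
    """
    M, N = scores.shape
    z = torch.full((M + 1, N + 1), float(dustbin_score))
    z[:M, :N] = scores
    log_p = z
    for _ in range(num_iters):
        # alternately normalize rows and columns in the log domain for stability
        log_p = log_p - torch.logsumexp(log_p, dim=1, keepdim=True)
        log_p = log_p - torch.logsumexp(log_p, dim=0, keepdim=True)
    return log_p.exp()

# usage: P = log_sinkhorn(desc0 @ desc1.T); mutual row/column maxima of P[:-1, :-1]
# above a confidence threshold give the coarse matching relation matrix.
```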

Claims (5)

1. An image feature point detection and matching network based on deep learning, characterized by comprising the following steps:
Step 1: extract fine feature point coordinates of the image to be matched using a SuperPoint network improved to sub-pixel precision.
Step 1.1: input the image to be matched into SuperPoint's VGG-structured encoder and extract a feature map of the image; the encoder contains 8 convolutional layers used for feature extraction with 64, 128 and 128 convolution kernels in sequence, 3 pooling layers, and several BN and activation function layers, the three 2×2 max-pooling layers performing the down-sampling. An H×W image is encoded into an H/8×W/8×128 feature map.
Step 1.2: input the feature map obtained in step 1.1 into SuperPoint's feature point decoder, which contains 2 convolutional layers with 256 and 65 channels respectively and several BN and activation function layers, and outputs a W/8×H/8×65 tensor. Softmax classification yields the normalized probability distribution over the 65 cases, and a Reshape layer restores the size to H×W×1, giving an H×W score map.
Step 1.3: input the score map obtained in step 1.2 into the coordinate sub-pixelization module designed by the invention, which proceeds as follows: apply non-maximum suppression to every non-overlapping 4×4 pixel window and set all pixel values except the maximum in each window to 0 to obtain a coarse feature point map; for all M coarse feature points, take a 5×5 pixel window in the score map centered on each coarse feature point coordinate, compute the expected offset relative to the center in the x and y directions with a Softargmax, and add the expected offsets to the coarse feature point coordinates to obtain M sub-pixelized fine feature point coordinates.
Step 2: extract the feature point and descriptor joint coding vectors from the sub-pixel-precision feature point coordinates.
Step 2.1: input the feature map obtained in step 1.1 into SuperPoint's descriptor decoder, which after several convolutions outputs an H/8×W/8×256 descriptor matrix.
Step 2.2: perform M bilinear interpolations in the descriptor matrix obtained in step 2.1 at the M sub-pixel coordinates output by step 1.3 to obtain M 256-dimensional vectors, and apply L2 regularization to these vectors to obtain the 256-dimensional descriptors corresponding to the sub-pixel feature points.
Step 2.3: combine the fine feature point coordinates obtained in step 1.3 with the descriptor vectors obtained in step 2.2 to obtain the feature point joint vectors.
Step 3: perform steps 1 and 2 on each of the two pictures to be matched to obtain two groups of feature point joint vectors, and input the two groups of vectors into the SuperGlue attention graph neural network and the optimal matching layer to obtain a matching relation matrix. Sort the matching relations by confidence from high to low.
Step 4: input the coordinates of all feature points in the two pictures and the matching relation matrix into the spatial constraint layer designed by the invention, and select the H-F spatial constraint model according to the bidirectional reprojection errors.
Step 4.1: using the fundamental matrix as the fitting model, fit all matching point pairs with the least-median-of-squares method to obtain a fitted matrix F1 and the set of matching point pairs consistent with F1 under a projection error threshold of 2 pixels.
Step 4.2: using the fundamental matrix as the fitting model, fit all matching point pairs with progressive sample consensus, dynamically allocating 150-300 iterations according to the average confidence of the matching point pairs in the top 30% of the confidence ranking, to obtain a fitted matrix F2 and the set of matching point pairs consistent with F2 under a projection error threshold of 2 pixels.
Step 4.3: using the homography matrix as the fitting model, fit all matching point pairs with the least-median-of-squares method to obtain a fitted matrix H1 and the set of matching point pairs consistent with H1 under a projection error threshold of 2 pixels.
Step 4.4: using the homography matrix as the fitting model, fit all matching point pairs with progressive sample consensus, dynamically allocating 150-300 iterations according to the average confidence of the matching point pairs in the top 30% of the confidence ranking, to obtain a fitted matrix H2 and the set of matching point pairs consistent with H2 under a projection error threshold of 2 pixels. Steps 4.1, 4.2, 4.3 and 4.4 run concurrently to save time.
Step 4.5: compute the average bidirectional reprojection error from H1 and its matching point pair set, compute the error for H2 in the same way, take the smaller of the two, keep the corresponding matrix as H, and record its average bidirectional reprojection error as SH.
Step 4.6: compute the average bidirectional reprojection error from F1 and its matching point pair set, compute the error for F2 in the same way, take the smaller of the two, keep the corresponding matrix as F, and record its average bidirectional reprojection error as SF.
Step 4.7: when SH/(SF + SH) > 0.4, keep the matrix H and output the matching point pair set corresponding to H; otherwise keep the matrix F and output the matching point pair set corresponding to F.
2. The method of claim 1, wherein step 1 uses a coordinate sub-pixelization module to extract refined feature point coordinates, giving the network more accurate feature extraction capability.
3. The method of claim 1, wherein after the descriptor matrix is obtained in step 2, the refined coordinates obtained in step 1 are used to obtain the corresponding sub-pixel descriptors by bilinear interpolation and L2 regularization, and the coordinates and descriptors are combined to obtain the joint coding vectors.
4. The method of claim 1, wherein after the matching relation matrix is obtained in step 3, the matching point pairs are sorted according to their confidence.
5. The method of claim 1, wherein the spatial constraint layer designed in step 4 uses the fundamental matrix and the homography matrix as fitting models in parallel, performs progressive sample consensus fitting with an iteration count set dynamically from the average confidence of the top 30% of matches and, in parallel, least-median-of-squares fitting, retains the best fitting result according to the bidirectional reprojection error of the inliers, and selects the correct fitting model with an adaptive discriminant.
CN202210856359.3A 2022-07-20 2022-07-20 Image feature point detection matching network based on deep learning Pending CN115115860A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210856359.3A CN115115860A (en) 2022-07-20 2022-07-20 Image feature point detection matching network based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210856359.3A CN115115860A (en) 2022-07-20 2022-07-20 Image feature point detection matching network based on deep learning

Publications (1)

Publication Number Publication Date
CN115115860A true CN115115860A (en) 2022-09-27

Family

ID=83333695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210856359.3A Pending CN115115860A (en) 2022-07-20 2022-07-20 Image feature point detection matching network based on deep learning

Country Status (1)

Country Link
CN (1) CN115115860A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116664643A (en) * 2023-06-28 2023-08-29 哈尔滨市科佳通用机电股份有限公司 Railway train image registration method and equipment based on SuperPoint algorithm
CN116664643B (en) * 2023-06-28 2024-08-13 哈尔滨市科佳通用机电股份有限公司 Railway train image registration method and equipment based on SuperPoint algorithm

Similar Documents

Publication Publication Date Title
CN110738697B (en) Monocular depth estimation method based on deep learning
Jiang et al. Edge-enhanced GAN for remote sensing image superresolution
CN110335290B (en) Twin candidate region generation network target tracking method based on attention mechanism
CN110533721B (en) Indoor target object 6D attitude estimation method based on enhanced self-encoder
CN111612807B (en) Small target image segmentation method based on scale and edge information
CN108416266B (en) Method for rapidly identifying video behaviors by extracting moving object through optical flow
CN108921926B (en) End-to-end three-dimensional face reconstruction method based on single image
CN110136062B (en) Super-resolution reconstruction method combining semantic segmentation
CN110070091B (en) Semantic segmentation method and system based on dynamic interpolation reconstruction and used for street view understanding
CN114187450A (en) Remote sensing image semantic segmentation method based on deep learning
CN112560865B (en) Semantic segmentation method for point cloud under outdoor large scene
CN114463492B (en) Self-adaptive channel attention three-dimensional reconstruction method based on deep learning
CN111899295B (en) Monocular scene depth prediction method based on deep learning
CN115082293A (en) Image registration method based on Swin transducer and CNN double-branch coupling
CN115359372A (en) Unmanned aerial vehicle video moving object detection method based on optical flow network
CN113538527B (en) Efficient lightweight optical flow estimation method, storage medium and device
CN114724155A (en) Scene text detection method, system and equipment based on deep convolutional neural network
CN116563682A (en) Attention scheme and strip convolution semantic line detection method based on depth Hough network
CN114677479A (en) Natural landscape multi-view three-dimensional reconstruction method based on deep learning
CN113436237A (en) High-efficient measurement system of complicated curved surface based on gaussian process migration learning
CN112581423A (en) Neural network-based rapid detection method for automobile surface defects
CN115546273A (en) Scene structure depth estimation method for indoor fisheye image
CN115115860A (en) Image feature point detection matching network based on deep learning
CN117975469A (en) Document image shape correction method and system based on deep learning
CN117593187A (en) Remote sensing image super-resolution reconstruction method based on meta-learning and transducer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination