CN115115860A - Image feature point detection matching network based on deep learning - Google Patents
Image feature point detection matching network based on deep learning
- Publication number: CN115115860A
- Application number: CN202210856359.3A
- Authority
- CN
- China
- Prior art keywords
- matrix
- matching
- fitting
- pixel
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V10/757: Matching configurations of points or features (hierarchy: G Physics; G06 Computing, calculating or counting; G06V Image or video recognition or understanding; G06V10/70 using pattern recognition or machine learning; G06V10/74 Image or video pattern matching, proximity measures in feature spaces; G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons, coarse-fine approaches, context analysis, selection of dictionaries)
- G06V10/40: Extraction of image or video features
- G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using neural networks
Abstract
The invention designs a fusion network based on an improved SuperPoint network and an improved SuperGlue network. The network extracts image feature points with a fully convolutional network and uses a sub-pixelization module that exploits neighborhood window information to improve the coordinate accuracy of the feature points. After the feature point coordinates and feature vectors are jointly encoded, an attention mechanism simulates the way humans match feature points, and the Sinkhorn algorithm solves for the matching relationship. The invention also designs an adaptive spatial constraint layer that screens the coarse matching point pairs with several methods computed in parallel under spatial constraints, adaptively determines the spatial relationship between the images, and extracts matched feature point pairs from the input images.
Description
Technical Field
The invention belongs to the field of computer image processing and relates to a deep-learning-based method for detecting and matching image feature points, which outputs matched feature point pairs and a spatial relationship matrix between two images.
Background
A feature point consists of a key point carrying texture and structure information in the image and a descriptor associated with that key point, so feature point detection comprises two steps: key point detection and descriptor computation. Commonly used key point detectors include the Laplacian method, the Harris corner detector, the difference-of-Gaussians detector, and the FAST corner detector. The SIFT operator detects key points in a DoG scale pyramid and computes a 128-dimensional descriptor containing gradient orientation information; the descriptor is scale-, rotation-, and affine-invariant and has excellent localization accuracy, but its computation is complex and therefore slow. Rublee proposed the ORB operator, which uses an improved FAST detector combined with an improved BRIEF descriptor; because the binary descriptor is encoded from neighborhood pixels, ORB is very fast while preserving scale invariance and is used in many real-time tasks. With the rapid development of deep learning, researchers have made continuous attempts and explorations in feature point detection. SuperPoint is a self-supervised deep learning feature point extraction algorithm with good image understanding and feature description capability, and it runs far faster on a GPU than traditional feature extraction algorithms. SuperGlue is an attentional GNN-based matching network adapted to SuperPoint that uses an attention mechanism to simulate how humans match images. SuperPoint has the drawback that its feature point coordinates are integers rather than floating-point values, which limits it in tasks with higher precision requirements, and SuperGlue lacks explicit spatial constraints when using a GNN to simulate human vision. At present, mainstream feature point detection for downstream tasks such as three-dimensional reconstruction still relies on hand-crafted feature detection methods represented by SIFT.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention discloses a feature point detection and matching method based on deep learning. The invention designs a fusion network based on an improved SuperPoint network and an improved SuperGlue network. The network extracts image feature points with a fully convolutional network and uses a sub-pixelization module that exploits neighborhood window information to improve the coordinate accuracy of the feature points. After the feature point coordinates and feature vectors are jointly encoded, an attention mechanism simulates the way humans match feature points, and the Sinkhorn algorithm solves for the matching relationship. The invention also designs an adaptive spatial constraint layer that screens the coarse matching point pairs with several methods computed in parallel under spatial constraints, adaptively determines the spatial relationship between the images, and extracts matched feature point pairs from the input images.
The invention adopts the following technical steps:
Step 1: extracting sub-pixel-accurate feature point coordinates of the image to be matched by using the SuperPoint network improved with sub-pixel precision.
Step 2: extracting the feature point and descriptor joint encoding vectors according to the sub-pixel feature point coordinates.
Step 3: performing steps 1 and 2 on each of the two pictures to be matched to obtain two groups of feature point joint vectors, and inputting the two groups of vectors into the SuperGlue attentional graph neural network and the optimal matching layer to obtain a matching relationship matrix, which describes the coarse matching relationships between feature points and the corresponding confidences. The matches are sorted by confidence from high to low.
Step 4: inputting the coordinates of all feature points in the two pictures and the matching relationship matrix into the spatial constraint layer designed by the invention, which comprises parallel fundamental matrix and homography matrix constraints and selects an H-F spatial constraint model according to the bidirectional reprojection errors.
Step 4.1: taking the fundamental matrix as the fitting model, fitting all matching point pairs by the least median of squares method to obtain a fitted matrix F1 and the set of matching point pairs consistent with F1 under a projection error threshold of 2 pixels.
Step 4.2: taking the fundamental matrix as the fitting model, fitting all matching point pairs by progressive sample consensus, with the number of iterations dynamically allocated between 150 and 300 according to the average confidence of the top 30% of matches by confidence, to obtain a fitted matrix F2 and the set of matching point pairs consistent with F2 under a projection error threshold of 2 pixels.
Step 4.3: taking the homography matrix as the fitting model, fitting all matching point pairs by the least median of squares method to obtain a fitted matrix H1 and the set of matching point pairs consistent with H1 under a projection error threshold of 2 pixels.
Step 4.4: taking the homography matrix as the fitting model, fitting all matching point pairs by progressive sample consensus, with the number of iterations dynamically allocated between 150 and 300 according to the average confidence of the top 30% of matches by confidence, to obtain a fitted matrix H2 and the set of matching point pairs consistent with H2 under a projection error threshold of 2 pixels. Steps 4.1, 4.2, 4.3, and 4.4 are run in parallel to save time.
Step 4.5: the homography matrix relates a point to its projected point, and the distance between the projected point and the matched point is taken as the projection error. The average bidirectional reprojection error of H1 is computed over the set of matching point pairs consistent with H1, the average bidirectional reprojection error of H2 is computed in the same way, the smaller of the two is taken, the corresponding matrix is kept as H, and the corresponding average bidirectional reprojection error is kept as SH.
Step 4.6: the fundamental matrix relates a point to its projected epipolar line, and the point-to-line distance between the projected epipolar line and the matched point is taken as the projection error. The average bidirectional reprojection error of F1 is computed over the set of matching point pairs consistent with F1, the average bidirectional reprojection error of F2 is computed in the same way, the smaller of the two is taken, the corresponding matrix is kept as F, and the corresponding average bidirectional reprojection error is kept as SF.
Step 4.7: when SH/(SF + SH) > 0.4, the matrix H is kept and the set of matching point pairs consistent with H is output; otherwise, the matrix F is kept and the set of matching point pairs consistent with F is output.
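For concreteness, the following sketch illustrates how steps 4.1-4.7 could be realized with OpenCV and NumPy. The helper names, the exact mapping from the top-30% average confidence to the 150-300 iteration budget, and the use of OpenCV's USAC_PROSAC flag (available in OpenCV 4.5 and later) as the progressive sample consensus fitter are illustrative assumptions, not the invention's reference implementation.

```python
import cv2
import numpy as np

def dynamic_iters(conf, low=150, high=300):
    # Assumed mapping: lower average confidence of the top-30% matches -> more iterations.
    top = np.sort(conf)[::-1][: max(1, int(0.3 * len(conf)))]
    return int(round(low + (high - low) * (1.0 - float(top.mean()))))

def mean_err_h(H, p1, p2):
    # Average bidirectional reprojection error under a homography (step 4.5).
    fwd = cv2.perspectiveTransform(p1.reshape(-1, 1, 2), H).reshape(-1, 2)
    bwd = cv2.perspectiveTransform(p2.reshape(-1, 1, 2), np.linalg.inv(H)).reshape(-1, 2)
    return 0.5 * (np.linalg.norm(fwd - p2, axis=1).mean() +
                  np.linalg.norm(bwd - p1, axis=1).mean())

def mean_err_f(Fm, p1, p2):
    # Average bidirectional point-to-epipolar-line distance under a fundamental matrix (step 4.6).
    l2 = cv2.computeCorrespondEpilines(p1.reshape(-1, 1, 2), 1, Fm).reshape(-1, 3)
    l1 = cv2.computeCorrespondEpilines(p2.reshape(-1, 1, 2), 2, Fm).reshape(-1, 3)
    d2 = np.abs((l2[:, :2] * p2).sum(1) + l2[:, 2]) / np.linalg.norm(l2[:, :2], axis=1)
    d1 = np.abs((l1[:, :2] * p1).sum(1) + l1[:, 2]) / np.linalg.norm(l1[:, :2], axis=1)
    return 0.5 * (d1.mean() + d2.mean())

def hf_constraint_layer(p1, p2, conf, thresh_px=2.0):
    # p1, p2: (N, 2) matched keypoints sorted by descending confidence (as PROSAC expects);
    # conf: (N,) matching confidences from step 3.
    p1 = np.asarray(p1, np.float64)
    p2 = np.asarray(p2, np.float64)
    iters = dynamic_iters(np.asarray(conf, np.float64))

    # Steps 4.1-4.4 (run sequentially here; the invention runs the four fits in parallel).
    F1, mF1 = cv2.findFundamentalMat(p1, p2, cv2.FM_LMEDS)
    F2, mF2 = cv2.findFundamentalMat(p1, p2, cv2.USAC_PROSAC, thresh_px, 0.99, iters)
    H1, mH1 = cv2.findHomography(p1, p2, cv2.LMEDS)
    H2, mH2 = cv2.findHomography(p1, p2, cv2.USAC_PROSAC, thresh_px, maxIters=iters)

    def best(candidates, err_fn):
        # Keep, for each model type, the candidate with the smaller bidirectional error
        # evaluated on its own consistent (inlier) matching point pairs.
        scored = []
        for M, mask in candidates:
            if M is None or mask is None:
                continue
            inl = mask.ravel().astype(bool)
            scored.append((err_fn(M, p1[inl], p2[inl]), M, inl))
        return min(scored, key=lambda t: t[0])

    SH, H, inH = best([(H1, mH1), (H2, mH2)], mean_err_h)
    SF, Fm, inF = best([(F1, mF1), (F2, mF2)], mean_err_f)

    # Step 4.7: adaptive choice between the homography and the fundamental matrix.
    if SH / (SF + SH) > 0.4:
        return "H", H, p1[inH], p2[inH]
    return "F", Fm, p1[inF], p2[inF]
```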
Compared with the prior art, the invention has the following beneficial effects:
(1) In feature detection and matching tasks oriented toward three-dimensional reconstruction, the network achieves higher accuracy.
(2) Compared with traditional feature point detection and matching methods such as SIFT and ORB, the method is more robust in scenes with strong illumination changes, large viewpoint changes, and the like.
Drawings
FIG. 1 is a diagram of a network architecture of the present invention
FIG. 2 is a block diagram of a sub-pixelization module according to the present invention
FIG. 3 is a block diagram of a descriptor decoder according to the present invention
FIG. 4 is a diagram of the spatial constraint layer structure according to the present invention
Detailed description of the preferred embodiment
The invention is further described below with reference to the accompanying drawings.
The invention designs a fast feature point detection and matching network framework: for feature point detection, the SuperPoint network is improved to sub-pixel precision, and for feature point matching, an adaptive spatially constrained dynamic progressive sample consensus module is added to the SuperGlue network. Fig. 1 shows the network structure of the invention.
First, sub-pixel-accurate feature point coordinates of the image to be matched are extracted using the SuperPoint network improved with sub-pixel precision. This mainly comprises the following three steps:
1. The image to be matched is input into the VGG-style encoder of SuperPoint to extract a feature map. The encoder comprises 8 convolutional layers, 3 pooling layers, and several BN and activation layers. The convolutional layers, with 64, 128, and 128 convolution kernels in sequence, extract features, and the three 2×2 max-pooling layers downsample an H×W picture to height H/8 and width W/8. An H×W image is thus encoded into an H/8×W/8×128 feature map;
2. The feature map obtained in step 1 is input into the feature point decoder of SuperPoint, which comprises 2 convolutional layers with 256 and 65 channels and several BN and activation layers, and outputs a W/8×H/8×65 tensor. Each 65-dimensional vector covers 65 cases for one non-overlapping 8×8 pixel window of the original image: the i-th pixel of the window is a feature point, or the window contains no feature point. Softmax classification yields a normalized probability distribution over the 65 cases, and a Reshape layer restores the size to H×W×1, giving an H×W score map whose pixel values lie between 0 and 1 and represent the probability that each pixel of the input image I is a feature point;
3. In the original SuperPoint, feature points are extracted by applying non-maximum suppression (NMS) within each N×N window of the output score map, keeping only one maximum per window, and then thresholding the whole map; points above the threshold are taken as feature points. This output has only integer-level coordinate accuracy and is not differentiable. The invention adds to SuperPoint a sub-pixel correction module that combines feature point neighborhood information; the main flow is shown in Fig. 2. The score map obtained in step 2 is input into the coordinate sub-pixelization module designed by the invention, which proceeds as follows: non-maximum suppression is applied to each non-overlapping 4×4 pixel window, and all pixel values except the maximum in each window are set to 0, yielding a coarse feature point map. Points in the coarse feature point map above a certain threshold are taken as coarse feature points. For each of the M coarse feature points, a 5×5 pixel window centered on its coordinates is taken in the score map, the expected offset of the window pixels relative to the center is computed separately in the x and y directions with the Softargmax method, and the expected offsets are added to the coarse coordinates to obtain M sub-pixel fine feature point coordinates.
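A minimal PyTorch sketch of this coordinate sub-pixelization is given below. The function name, the score threshold value, and the clipping of windows at image borders are illustrative assumptions; it assumes H and W are multiples of 4, as is the case for images processed by the SuperPoint encoder.

```python
import torch
import torch.nn.functional as F

def subpixel_refine(score_map: torch.Tensor, score_thresh: float = 0.015, win: int = 5):
    """score_map: (H, W) per-pixel feature point probabilities in [0, 1]; H, W multiples of 4."""
    H, W = score_map.shape
    # Non-maximum suppression over non-overlapping 4x4 windows:
    # keep only the maximum of each window and set every other pixel to 0.
    pooled = F.max_pool2d(score_map[None, None], kernel_size=4, stride=4)
    win_max = F.interpolate(pooled, scale_factor=4, mode="nearest")[0, 0]
    coarse = torch.where(score_map == win_max, score_map, torch.zeros_like(score_map))

    ys, xs = torch.nonzero(coarse > score_thresh, as_tuple=True)  # M coarse feature points
    r = win // 2
    refined = []
    for y, x in zip(ys.tolist(), xs.tolist()):
        # 5x5 window of the score map centered on the coarse feature point (clipped at borders).
        y0, y1 = max(y - r, 0), min(y + r + 1, H)
        x0, x1 = max(x - r, 0), min(x + r + 1, W)
        patch = score_map[y0:y1, x0:x1]
        w = F.softmax(patch.flatten(), dim=0).reshape(patch.shape)  # Softargmax weights
        dy = (w.sum(dim=1) * torch.arange(y0 - y, y1 - y, dtype=w.dtype)).sum()
        dx = (w.sum(dim=0) * torch.arange(x0 - x, x1 - x, dtype=w.dtype)).sum()
        refined.append((x + dx.item(), y + dy.item()))              # sub-pixel (x, y)
    return refined
```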
Then, the feature point and descriptor joint encoding vectors are extracted from the sub-pixel feature point coordinates, as shown in Fig. 3. The descriptor decoder receives the H/8×W/8×128 feature map and, after several convolutions, outputs an H/8×W/8×256 initial descriptor matrix. The original SuperPoint network interpolates this matrix to H×W×256 with bicubic interpolation and then L2-normalizes the 256 channels, computing a 256-dimensional descriptor for every pixel of the original image I; in practice, descriptors of non-feature-point pixels need not be computed. The improved descriptor decoder instead performs M bilinear interpolations in the initial descriptor matrix at the M sub-pixel coordinates output by the sub-pixelization module, obtaining M 256-dimensional vectors, which are then L2-normalized to give the final 256-dimensional descriptors.
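The following sketch, under the same illustrative assumptions as above, shows how descriptors could be sampled at the M sub-pixel coordinates by bilinear interpolation (via grid_sample) followed by L2 normalization; the coordinate normalization is a simplified assumption and differs slightly from SuperPoint's released code.

```python
import torch
import torch.nn.functional as F

def sample_descriptors(desc_map: torch.Tensor, keypoints: torch.Tensor, image_hw):
    """desc_map: (1, 256, H/8, W/8) initial descriptor matrix;
    keypoints: (M, 2) sub-pixel (x, y) coordinates in the original image; image_hw: (H, W)."""
    H, W = image_hw
    # Map image coordinates to the [-1, 1] grid expected by grid_sample.
    grid = keypoints.clone().float()
    grid[:, 0] = grid[:, 0] / (W - 1) * 2 - 1
    grid[:, 1] = grid[:, 1] / (H - 1) * 2 - 1
    grid = grid.view(1, 1, -1, 2)                                       # (1, 1, M, 2)
    # Bilinear interpolation of the 256-channel descriptor map at each sub-pixel location.
    desc = F.grid_sample(desc_map, grid, mode="bilinear", align_corners=True)
    desc = desc.view(256, -1).t()                                       # (M, 256)
    return F.normalize(desc, p=2, dim=1)                                # L2-normalized descriptors
```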
Second, the core idea of SuperGlue is to cast the feature point matching problem as an optimal transport problem over the feature point and descriptor joint encoding vectors and to solve it iteratively with the Sinkhorn algorithm. In this process, an attentional GNN simulates the repeated back-and-forth viewing that human eyes perform during matching, and cross-attention and self-attention mechanisms strengthen the joint matching of position coordinates and visual descriptors. SuperGlue, however, imposes no explicit, strict spatial relationship constraint during matching. The invention improves on this by adding a spatial constraint layer to SuperGlue and raising its matching accuracy with a dynamic progressive sample consensus module under an adaptive H-F spatial constraint. Fig. 4 shows the structure of the spatial constraint layer. The two stages above are performed on the two pictures to be matched to obtain two groups of feature point joint vectors, which are input into the SuperGlue attentional graph neural network and the optimal matching layer to obtain a matching relationship matrix describing the coarse matching relationships between feature points and the corresponding confidences; the matches are sorted by confidence from high to low. Finally, the coordinates of all feature points in the two pictures and the matching relationship matrix are input into the spatial constraint layer designed by the invention, which applies fundamental matrix and homography matrix constraints in parallel and selects the H-F spatial constraint model according to the bidirectional reprojection errors.
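To make the optimal transport step concrete, here is a minimal log-domain Sinkhorn sketch operating on a score matrix between the two groups of joint encoding vectors (for example, scaled inner products of the GNN-refined descriptors). The dustbin score, iteration count, and marginal choices are illustrative assumptions and differ in detail from SuperGlue's released implementation.

```python
import torch

def sinkhorn_matching(scores: torch.Tensor, dustbin: float = 1.0, iters: int = 100):
    """scores: (M, N) similarity matrix between the two sets of joint encoding vectors."""
    M, N = scores.shape
    # Augment with a dustbin row and column so that unmatched feature points can be absorbed.
    S = torch.full((M + 1, N + 1), dustbin, dtype=scores.dtype)
    S[:M, :N] = scores
    # Each real keypoint carries unit mass; each dustbin absorbs the other image's total mass.
    log_mu = torch.cat([torch.zeros(M), torch.tensor([float(N)]).log()])
    log_nu = torch.cat([torch.zeros(N), torch.tensor([float(M)]).log()])
    u = torch.zeros(M + 1)
    v = torch.zeros(N + 1)
    for _ in range(iters):  # alternating row/column normalization in the log domain
        u = log_mu - torch.logsumexp(S + v[None, :], dim=1)
        v = log_nu - torch.logsumexp(S + u[:, None], dim=0)
    P = torch.exp(S + u[:, None] + v[None, :])
    return P[:M, :N]        # coarse matching relationship matrix with confidences
```

The returned matrix plays the role of the matching relationship matrix: its entries can be thresholded and mutually-best-matched to obtain the coarse matches that are then sorted by confidence and passed to the spatial constraint layer.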
Claims (5)
1. An image feature point detection and matching network based on deep learning, characterized by comprising the following steps:
Step 1: extracting sub-pixel-accurate feature point coordinates of the image to be matched by using the SuperPoint network improved with sub-pixel precision.
Step 1.1: inputting the image to be matched into the VGG-style encoder of SuperPoint and extracting a feature map, wherein the encoder comprises 8 convolutional layers, 3 pooling layers, and several BN and activation layers; the convolutional layers, with 64, 128, and 128 convolution kernels in sequence, extract features, and three 2×2 max-pooling layers perform the downsampling, so that an H×W image is encoded into an H/8×W/8×128 feature map.
Step 1.2: inputting the feature map obtained in step 1.1 into the feature point decoder of SuperPoint, which comprises 2 convolutional layers with 256 and 65 channels respectively and several BN and activation layers, and outputting a W/8×H/8×65 tensor; Softmax classification yields a normalized probability distribution over the 65 cases, and a Reshape layer restores the size to H×W×1, giving an H×W score map.
Step 1.3: inputting the score map obtained in step 1.2 into the coordinate sub-pixelization module designed by the invention, which proceeds as follows: non-maximum suppression is applied to each non-overlapping 4×4 pixel window, and all pixel values except the maximum in each window are set to 0 to obtain a coarse feature point map; for each of the M coarse feature points, a 5×5 pixel window centered on its coordinates is taken in the score map, the expected offset of the window pixels relative to the center is computed separately in the x and y directions with the Softargmax method, and the expected offsets are added to the coarse coordinates to obtain M sub-pixel fine feature point coordinates.
Step 2: extracting the feature point and descriptor joint encoding vectors according to the sub-pixel feature point coordinates.
Step 2.1: inputting the feature map obtained in step 1.1 into the descriptor decoder of SuperPoint and, after several convolutions, outputting an H/8×W/8×256 descriptor matrix.
Step 2.2: performing M bilinear interpolations in the descriptor matrix obtained in step 2.1 at the M sub-pixel coordinates output in step 1.3 to obtain M 256-dimensional vectors, and applying L2 regularization to these vectors to obtain the 256-dimensional descriptors corresponding to the sub-pixel feature points.
Step 2.3: combining the fine feature point coordinates obtained in step 1.3 with the descriptor vectors obtained in step 2.2 to obtain the feature point joint vectors.
Step 3: performing steps 1 and 2 on each of the two pictures to be matched to obtain two groups of feature point joint vectors, and inputting the two groups of vectors into the SuperGlue attentional graph neural network and the optimal matching layer to obtain a matching relationship matrix; the matches are sorted by confidence from high to low.
Step 4: inputting the coordinates of all feature points in the two pictures and the matching relationship matrix into the spatial constraint layer designed by the invention, and selecting an H-F spatial constraint model according to the bidirectional reprojection errors.
Step 4.1: taking the fundamental matrix as the fitting model, fitting all matching point pairs by the least median of squares method to obtain a fitted matrix F1 and the set of matching point pairs consistent with F1 under a projection error threshold of 2 pixels.
Step 4.2: taking the fundamental matrix as the fitting model, fitting all matching point pairs by progressive sample consensus, with the number of iterations dynamically allocated between 150 and 300 according to the average confidence of the top 30% of matches by confidence, to obtain a fitted matrix F2 and the set of matching point pairs consistent with F2 under a projection error threshold of 2 pixels.
Step 4.3: taking the homography matrix as the fitting model, fitting all matching point pairs by the least median of squares method to obtain a fitted matrix H1 and the set of matching point pairs consistent with H1 under a projection error threshold of 2 pixels.
Step 4.4: taking the homography matrix as the fitting model, fitting all matching point pairs by progressive sample consensus, with the number of iterations dynamically allocated between 150 and 300 according to the average confidence of the top 30% of matches by confidence, to obtain a fitted matrix H2 and the set of matching point pairs consistent with H2 under a projection error threshold of 2 pixels; steps 4.1, 4.2, 4.3, and 4.4 are run in parallel to save time.
Step 4.5: computing the average bidirectional reprojection error of H1 over the set of matching point pairs consistent with H1, computing the average bidirectional reprojection error of H2 in the same way, taking the smaller of the two, keeping the corresponding matrix as H, and keeping the corresponding average bidirectional reprojection error as SH.
Step 4.6: computing the average bidirectional reprojection error of F1 over the set of matching point pairs consistent with F1, computing the average bidirectional reprojection error of F2 in the same way, taking the smaller of the two, keeping the corresponding matrix as F, and keeping the corresponding average bidirectional reprojection error as SF.
Step 4.7: when SH/(SF + SH) > 0.4, keeping the matrix H and outputting the set of matching point pairs consistent with H; otherwise, keeping the matrix F and outputting the set of matching point pairs consistent with F.
2. The method of claim 1, characterized in that step 1 uses the coordinate sub-pixelization module to extract refined feature point coordinates, giving the network more accurate feature extraction capability.
3. The method of claim 1, characterized in that after the descriptor matrix is obtained in step 2, the refined coordinates obtained in step 1 are substituted into it, the corresponding sub-pixel descriptors are obtained by bilinear interpolation and L2 regularization, and the coordinates and descriptors are combined into the joint encoding vector.
4. The method of claim 1, characterized in that after the matching relationship matrix is obtained in step 3, the matching point pairs are sorted according to confidence.
5. The method of claim 1, characterized in that the spatial constraint layer designed in step 4 takes the fundamental matrix and the homography matrix as fitting models in parallel; it performs fitting with a progressive sample consensus module whose number of iterations is set dynamically according to the average confidence of the top 30% of matches, performs fitting in parallel with the least median of squares method, retains the best fitting result according to the bidirectional reprojection error of the inliers, and applies an adaptive criterion to select the correct fitting model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210856359.3A CN115115860A (en) | 2022-07-20 | 2022-07-20 | Image feature point detection matching network based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210856359.3A CN115115860A (en) | 2022-07-20 | 2022-07-20 | Image feature point detection matching network based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---
CN115115860A (en) | 2022-09-27
Family
ID=83333695
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210856359.3A Pending CN115115860A (en) | 2022-07-20 | 2022-07-20 | Image feature point detection matching network based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115115860A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116664643A (en) * | 2023-06-28 | 2023-08-29 | 哈尔滨市科佳通用机电股份有限公司 | Railway train image registration method and equipment based on SuperPoint algorithm |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116664643A (en) * | 2023-06-28 | 2023-08-29 | 哈尔滨市科佳通用机电股份有限公司 | Railway train image registration method and equipment based on SuperPoint algorithm |
CN116664643B (en) * | 2023-06-28 | 2024-08-13 | 哈尔滨市科佳通用机电股份有限公司 | Railway train image registration method and equipment based on SuperPoint algorithm |
Similar Documents
Publication | Title
---|---
CN110738697B (en) | Monocular depth estimation method based on deep learning
Jiang et al. | Edge-enhanced GAN for remote sensing image superresolution
CN110335290B (en) | Twin candidate region generation network target tracking method based on attention mechanism
CN110533721B (en) | Indoor target object 6D attitude estimation method based on enhanced self-encoder
CN111612807B (en) | Small target image segmentation method based on scale and edge information
CN108416266B (en) | Method for rapidly identifying video behaviors by extracting moving object through optical flow
CN108921926B (en) | End-to-end three-dimensional face reconstruction method based on single image
CN110136062B (en) | Super-resolution reconstruction method combining semantic segmentation
CN110070091B (en) | Semantic segmentation method and system based on dynamic interpolation reconstruction and used for street view understanding
CN114187450A (en) | Remote sensing image semantic segmentation method based on deep learning
CN112560865B (en) | Semantic segmentation method for point cloud under outdoor large scene
CN114463492B (en) | Self-adaptive channel attention three-dimensional reconstruction method based on deep learning
CN111899295B (en) | Monocular scene depth prediction method based on deep learning
CN115082293A (en) | Image registration method based on Swin transducer and CNN double-branch coupling
CN115359372A (en) | Unmanned aerial vehicle video moving object detection method based on optical flow network
CN113538527B (en) | Efficient lightweight optical flow estimation method, storage medium and device
CN114724155A (en) | Scene text detection method, system and equipment based on deep convolutional neural network
CN116563682A (en) | Attention scheme and strip convolution semantic line detection method based on depth Hough network
CN114677479A (en) | Natural landscape multi-view three-dimensional reconstruction method based on deep learning
CN113436237A (en) | High-efficient measurement system of complicated curved surface based on gaussian process migration learning
CN112581423A (en) | Neural network-based rapid detection method for automobile surface defects
CN115546273A (en) | Scene structure depth estimation method for indoor fisheye image
CN115115860A (en) | Image feature point detection matching network based on deep learning
CN117975469A (en) | Document image shape correction method and system based on deep learning
CN117593187A (en) | Remote sensing image super-resolution reconstruction method based on meta-learning and transducer
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination