
CN113032613B - Three-dimensional model retrieval method based on interactive attention convolution neural network - Google Patents

Three-dimensional model retrieval method based on interactive attention convolution neural network

Info

Publication number
CN113032613B
Authority
CN
China
Prior art keywords
model
neural network
view
sketch
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110270518.7A
Other languages
Chinese (zh)
Other versions
CN113032613A (en)
Inventor
贾雯惠 (Jia Wenhui)
高雪瑶 (Gao Xueyao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin University of Science and Technology
Original Assignee
Harbin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin University of Science and Technology filed Critical Harbin University of Science and Technology
Priority to CN202110270518.7A priority Critical patent/CN113032613B/en
Publication of CN113032613A publication Critical patent/CN113032613A/en
Application granted granted Critical
Publication of CN113032613B publication Critical patent/CN113032613B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • G06T7/66Analysis of geometric attributes of image moments or centre of gravity
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Library & Information Science (AREA)
  • Databases & Information Systems (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a three-dimensional model retrieval method based on an interactive attention convolutional neural network. First, the three-dimensional model is preprocessed: projections are taken at fixed angles to obtain six views of the model, which are converted into line drawings to serve as the model's view set. Second, an interactive attention module is embedded in the convolutional neural network to extract semantic features, increasing data interaction between two network layers. Global features are extracted with the Gist algorithm and a two-dimensional shape distribution algorithm. Third, the similarity between the sketch and each two-dimensional view is computed with the Euclidean distance, and the features are combined with weights to retrieve the three-dimensional model. The method alleviates the inaccurate semantic features caused by overfitting when a neural network is trained on small-sample data, and improves the accuracy of three-dimensional model retrieval.

Description

Three-dimensional model retrieval method based on interactive attention convolution neural network
Technical field:
the invention relates to a three-dimensional model retrieval method based on an interactive attention convolutional neural network, which is applicable to the field of three-dimensional model retrieval.
Background art:
in recent years, with the rapid development of science and technology, three-dimensional models have come to play an important role in many professional fields and have also spread widely into daily life, so the demand for retrieving three-dimensional models is growing. Example-based three-dimensional model retrieval can only take models already in a database as queries, so it lacks generality. Sketch-based three-dimensional model retrieval lets users draw queries freely according to their needs; it is convenient, practical, and has broad prospects.
Currently, common approaches use a single manual feature or a deep-learning algorithm to solve the sketch-based model retrieval problem. Traditional manual features have drawbacks: researchers need a large amount of prior knowledge, parameters must be set manually in advance, and the extracted features often fall short of expectations. A deep-learning algorithm can adjust parameters automatically and has good extensibility, but it also has shortcomings. Because a deep neural network has many nodes, a large amount of data is needed to train it well; once the training data are insufficient, overfitting occurs and the results are biased. To obtain better retrieval results when training samples are insufficient, the invention provides a three-dimensional model retrieval method based on an interactive attention convolutional neural network.
Summary of the invention:
the invention discloses a three-dimensional model retrieval method based on an interactive attention convolutional neural network, aiming to solve the poor retrieval performance of sketch-based three-dimensional model retrieval methods when training samples are insufficient.
Therefore, the invention provides the following technical scheme:
1. a three-dimensional model retrieval method based on an interactive attention convolution neural network is characterized by comprising the following steps:
Step 1: perform data preprocessing; project the three-dimensional model to obtain a plurality of corresponding views, and obtain the model's edge view set using an edge detection algorithm.
Step 2: design a deep convolutional neural network and optimize the network model with an interactive attention module. Select one part of the view set as the training set and the other part as the test set.
Step 3: training includes two processes, forward propagation and backward propagation. Training data serve as the input for training the interactive attention convolutional neural network model, and the optimized model is obtained through this training.
Step 4: use the optimized interactive attention convolutional neural network model and the gist feature to extract the semantic features of the freehand sketch and the model views, and use the two-dimensional shape distribution feature to extract their two-dimensional shape distribution features.
Step 5: fuse the features by weighting, and retrieve the model most similar to the hand-drawn sketch according to the Euclidean distance.
2. The method for retrieving a three-dimensional model based on an interactive attention convolutional neural network as claimed in claim 1, wherein in the step 1, the three-dimensional model is projected to obtain a plurality of views corresponding to the three-dimensional model and an edge detection algorithm is used to obtain an edge view set of the model, and the specific steps are as follows:
step 1-1, setting a three-dimensional model at the center of a virtual sphere;
Step 1-2, place a virtual camera above the model and rotate the model through 360 degrees in 30-degree steps to obtain a set of 12 views of the three-dimensional model;
Steps 1-3, obtain the edge view of each of the 12 original views using the Canny edge detection algorithm;
After projection, the three-dimensional model is characterized as a group of two-dimensional views, and the Canny edge detection algorithm reduces the semantic gap between the hand-drawn sketch and the three-dimensional model views.
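As an illustration of steps 1-1 to 1-3, the following Python sketch applies OpenCV's Canny detector to pre-rendered projection views. The directory layout, file pattern and thresholds are assumptions for illustration, not values fixed by the patent.

import cv2
import glob

def edge_view_set(view_dir, low=50, high=150):
    # Convert 12 rendered projection views (one per 30-degree step,
    # assumed to exist as grayscale PNGs in view_dir) into Canny edge views.
    edge_views = []
    for path in sorted(glob.glob(f"{view_dir}/*.png")):
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        blurred = cv2.GaussianBlur(img, (5, 5), 0)  # suppress noise before edge detection
        edge_views.append(cv2.Canny(blurred, low, high))  # binary edge map
    return edge_views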
3. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in the step 2, the deep convolutional neural network is designed, and the network model is optimized by using the interactive attention module, and the specific steps are as follows:
step 2-1, determining the depth of a convolutional neural network, the size of a convolutional kernel, and the number of convolutional layers and pooling layers;
Step 2-2, design an interactive attention module: connect a global pooling layer after the output of convolutional layer conv_n and compute the information amount Z_k of each channel. The information amount is calculated as follows:
Z_k = (1/(W_n × H_n)) · Σ_{i=1}^{H_n} Σ_{j=1}^{W_n} conv_nk(i, j)
wherein conv_nk denotes the kth feature map output by the nth convolutional layer, of size W_n × H_n.
Step 2-3, connect two fully connected layers after the global pooling layer, and adaptively adjust the attention weight S_kn of each channel according to the information amount. The weight is calculated as follows:
S_kn = F_ex(Z, W) = σ(g(Z, W)) = σ(W_2 · δ(W_1 · Z))
wherein δ is the ReLU function, σ is the sigmoid function, and W_1, W_2 are the weights of the first and second fully connected layers, respectively.
Step 2-4, calculate the interactive attention weights S_k1 and S_k2 of the two neighboring convolutional layers respectively, and fuse them to obtain the optimal attention weight S_k. The optimal attention weight is calculated as follows:
S_k = Average(S_k1, S_k2)
Step 2-5, fuse the attention weight S_k with the second convolutional layer conv_2 and the first pooling layer a_p to obtain the final result a_2. The fusion formula is as follows:
a_2 = S_k ⊗ conv_2 ⊕ a_p
wherein ⊗ scales each channel of conv_2 by its attention weight and ⊕ denotes element-wise addition.
Select one part of the view set as the training set and the other part as the test set.
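A minimal PyTorch sketch of the interactive attention module in steps 2-2 to 2-5 follows. The reduction ratio r, the assumption that both neighboring layers share one channel count, and the requirement that a_p matches conv_2 in shape are illustrative choices the patent does not fix.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # Squeeze-and-excitation branch: global pooling (step 2-2) plus
    # two fully connected layers with ReLU and sigmoid (step 2-3).
    def __init__(self, channels, r=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # Z_k: per-channel mean
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // r), nn.ReLU(),
            nn.Linear(channels // r, channels), nn.Sigmoid(),
        )

    def forward(self, x):              # x: (B, C, H, W)
        z = self.pool(x).flatten(1)    # (B, C)
        return self.fc(z)              # S_kn in (0, 1)

class InteractiveAttention(nn.Module):
    # Average the attention of two neighboring conv layers (step 2-4)
    # and fuse with conv_2 and the pooled features a_p (step 2-5).
    def __init__(self, channels):
        super().__init__()
        self.att1 = ChannelAttention(channels)
        self.att2 = ChannelAttention(channels)

    def forward(self, conv1_out, conv2_out, a_p):
        s_k = (self.att1(conv1_out) + self.att2(conv2_out)) / 2  # S_k
        s_k = s_k.unsqueeze(-1).unsqueeze(-1)                    # (B, C, 1, 1)
        return s_k * conv2_out + a_p                             # a_2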
4. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in the step 3, the convolutional neural network model is trained, and the specific steps are as follows:
step 3-1, inputting training data into an initialized interactive attention convolution neural network model;
Step 3-2, extract view features through the convolutional layers: the shallow convolutional layers extract low-level features and the deeper layers extract high-level semantic features;
Steps 3-3, the attention module is fused with the neighboring convolutional layer through channel weighting, reducing the information lost when the edge view of the hand-drawn sketch or model is pooled;
Step 3-4, reduce the scale of the view features through a pooling layer, thereby reducing the number of parameters and speeding up model computation;
Step 3-5, alleviate the overfitting caused by insufficient training samples through a Dropout layer;
Steps 3-6, after alternating convolution, attention module, Dropout and pooling operations, finally input a fully connected layer, which reduces the dimensions of the extracted features and connects them into a one-dimensional high-level semantic feature vector;
Steps 3-7, use the labeled 2D views to optimize the weights and biases of the interactive attention convolutional neural network during back propagation. The 2D view set is {v_1, v_2, …, v_n} with label set {l_1, l_2, …, l_n}. The 2D views have t classes: 1, 2, …, t. After forward propagation, the prediction probability of v_i in class j is y_test_ij. Compare the label l_i of v_i with class j to compute the expected probability y_ij:
y_ij = 1 if l_i = j, and y_ij = 0 otherwise
Step 3-8, compare the prediction probability y_test_ij with the true probability y_ij, and calculate the error loss using the cross-entropy loss function.
The error loss is calculated as follows:
loss = -Σ_{i=1}^{n} Σ_{j=1}^{t} y_ij · log(y_test_ij)
Iterate the interactive attention convolutional neural network model continuously to obtain the optimized model, and save its weights and biases.
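A small NumPy sketch of the expected probability and cross-entropy loss in steps 3-7 and 3-8 (shapes, names and the per-sample averaging are illustrative):

import numpy as np

def cross_entropy_loss(labels, y_test, t):
    # labels: (n,) integer class labels l_i in {0, ..., t-1};
    # y_test: (n, t) softmax outputs y_test_ij.
    n = labels.shape[0]
    y = np.zeros((n, t))
    y[np.arange(n), labels] = 1.0  # expected probability y_ij (step 3-7)
    eps = 1e-12                    # avoid log(0)
    return -np.sum(y * np.log(y_test + eps)) / n  # step 3-8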
5. The method for retrieving the three-dimensional model based on the interactive attention convolution neural network as claimed in claim 1, wherein in the step 4, the optimized interactive attention convolution neural network model and the gist feature are used to extract semantic features of the freehand sketch and the model view respectively, and the two-dimensional shape distribution feature of the freehand sketch and the model view is extracted respectively by using the two-dimensional shape distribution feature, and the specific process is as follows:
step 4-1, inputting the test data into the optimized interactive attention convolution neural network model;
and 4-2, extracting the characteristics of the full connection layer to be used as high-level semantic characteristics of the hand-drawn sketch or the model view.
Step 4-3, divide the sketch or 2D view of size m × n into 4 × 4 blocks. Each block has size a × b, where a = m/4 and b = n/4.
Step 4-4, process each block with 32 Gabor filters over 4 scales and 8 directions, and combine the processed features to obtain the gist feature. The formula is as follows:
G(x, y) = cat( g_ij(x, y) * I(x, y) ), i = 1, …, 4, j = 1, …, 8
wherein G(x, y) is the gist feature over the 32 Gabor filters, cat() denotes the concatenation operation, x and y are pixel positions, I(x, y) denotes a block, g_ij(x, y) is the filter at the ith scale and jth direction, and * denotes convolution.
Step 4-5, randomly and equidistantly sampling points on the boundary of the sketch or the 2D view, and collecting the points as points = { (x) 1 ,y 1 ),…,(x i ,y i ),…,(x n ,y n ) }. Here (x) i ,y i ) Are the coordinates of the points.
Steps 4-6 represent the distance between the centroid and the random sample point on the sketch or two-dimensional view boundary using the D1 descriptor. Extracting dots from the dots, and collecting PD1= { ai = { 1 ,…,ai k ,…,ai N }. D1 set of shape distribution features as { D1_ v 1 ,…,D1_v i ,…,D1_v Bins }. Wherein D1_ v i Is a statistic of intervals (BinsSize × (i-1), binsSize ×, i), bins is the number of intervals, and BinsSize is the length of the intervals. D1_ v i The calculation formula of (c) is as follows:
D1_v i =|{P|dist(P,O)∈(BinSize*(i-1),BinSize*i),P∈PD1}|
where BinsSize = max ({ dist (P, O) | P ∈ PD1 })/N, dist () is the euclidean distance between two points. O is the centroid of the sketch or 2D view.
Steps 4-7, use the D2 descriptor to describe the distance between two random sample points on the sketch or two-dimensional view boundary. Point pairs are drawn from points and collected as PD2 = {(ai_1, bi_1), (ai_2, bi_2), …, (ai_N, bi_N)}. The D2 shape distribution features are {D2_v_1, …, D2_v_i, …, D2_v_Bins}, where D2_v_i represents the statistic in the interval (BinSize·(i-1), BinSize·i). D2_v_i is calculated as follows:
D2_v_i = |{P | dist(P) ∈ (BinSize·(i-1), BinSize·i), P ∈ PD2}|
where BinSize = max({dist(P) | P ∈ PD2})/N.
Steps 4-8, use the D3 descriptor to describe the square root of the area formed by three random sample points on the sketch or 2D view boundary. Point triplets are drawn from points and collected as PD3 = {(ai_1, bi_1, ci_1), (ai_2, bi_2, ci_2), …, (ai_n, bi_n, ci_n)}. The D3 shape distribution features are {D3_v_1, …, D3_v_i, …, D3_v_Bins}, where D3_v_i is the statistic in the interval (BinSize·(i-1), BinSize·i). D3_v_i is calculated as follows:
D3_v_i = |{P | herson(P) ∈ (BinSize·(i-1), BinSize·i), P ∈ PD3}|
wherein,
BinSize = max({herson(P) | P ∈ PD3})/N
herson() denotes Heron's formula; the square root of the area of the triangle P = (P_1, P_2, P_3) is computed using Heron's formula as follows:
herson(P) = √S, S = √(p·(p-a)·(p-b)·(p-c))
p = (a + b + c)/2
where a = dist(P_1, P_2), b = dist(P_1, P_3), c = dist(P_2, P_3).
Step 4-9, connect D1_v_i, D2_v_i and D3_v_i to form the shape distribution feature, i = 1, 2, …, Bins.
6. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in the step 5, a plurality of features are fused and the model most similar to the hand-drawn sketch is retrieved according to a similarity measurement formula, by the following specific process:
step 5-1, selecting Euclidean distance as a similarity measurement method;
Step 5-2, extract feature vectors from the two-dimensional views and the sketch with the improved interactive attention convolutional neural network and normalize them. The similarity computed with the Euclidean distance is denoted distance1 and the retrieval accuracy t1;
Step 5-3, extract feature vectors of the sketch and the model views with the gist feature and normalize them. The similarity computed with the Euclidean distance is denoted distance2 and the retrieval accuracy t2;
Step 5-4, extract feature vectors of the sketch and the model views with the two-dimensional shape distribution feature and normalize them. The similarity computed with the Euclidean distance is denoted distance3 and the retrieval accuracy t3;
Step 5-5, compare the accuracies of the three features and fuse the features by weighting into a new feature similarity Sim(distance). The formula is as follows:
Sim(distance) = w_1·distance1 + w_2·distance2 + w_3·distance3, with w_1 + w_2 + w_3 = 1
where w_1 = t_1/(t_1 + t_2 + t_3), w_2 = t_2/(t_1 + t_2 + t_3), w_3 = t_3/(t_1 + t_2 + t_3).
Step 5-6, sort by similarity from small to large to obtain the retrieval result.
Advantageous effects:
1. the invention discloses a three-dimensional model retrieval method based on an interactive attention convolutional neural network. Model retrieval was performed on the SHREC13 database and the ModelNet40 database, and the experimental results show that the method achieves high accuracy.
2. The retrieval model used by the invention combines an interactive attention module with a convolutional neural network. The convolutional neural network has local perception and parameter sharing, handles high-dimensional data well, and does not require manual selection of data features. The proposed interactive attention model combines the attention weights of two adjacent convolutional layers to realize data interaction between the two network layers. The trained convolutional neural network model yields a better retrieval effect.
3. When training the model, parameters are updated by stochastic gradient descent. In back propagation, the error returns along the original route, i.e., the parameters of each layer are updated layer by layer from the output layer backward through each hidden layer. Forward and backward propagation are performed repeatedly to reduce the error and update the model parameters until the CNN is trained.
4. The invention improves the three-dimensional shape distribution features so that they are applicable to sketches and two-dimensional views. The shape information of the sketch and the three-dimensional model views is described using shape distribution functions.
5. The invention adopts adaptive fusion of multiple features to fuse their similarities, achieving a better retrieval effect.
Description of the drawings:
fig. 1 is a sketch to be retrieved in an embodiment of the present invention.
Fig. 2 is a three-dimensional model search framework diagram according to an embodiment of the present invention.
FIG. 3 is a projection view of a model in an embodiment of the invention.
Fig. 4 is a Canny edge view in an embodiment of the invention.
FIG. 5 is a model of an interactive attention convolutional neural network in an embodiment of the present invention.
FIG. 6 is a training process of the interactive attention convolution neural network in an embodiment of the present invention.
FIG. 7 illustrates a testing process of the Interactive attention convolutional neural network in an embodiment of the present invention.
Detailed description of the embodiments:
in order to clearly and completely describe the technical solutions in the embodiments of the present invention, the present invention is further described in detail below with reference to the drawings in the embodiments.
The invention is experimentally verified using sketches from SHREC13 and models from the ModelNet40 model library. Take "17205.png" from the SHREC13 sketch set and "table_0399.off" from the ModelNet40 model library as examples. The sketch to be retrieved is shown in fig. 1.
The experimental framework of the three-dimensional model retrieval method based on the interactive attention convolutional neural network is shown in fig. 2, and the implementation comprises the following steps:
step 1, projecting the three-dimensional model to obtain a three-dimensional model edge view set, which specifically comprises the following steps:
Step 1-1, the table_0399.off file is placed at the center of a virtual sphere.
Step 1-2, place a virtual camera above the model and rotate the model through 360 degrees in 30-degree steps to obtain a set of 12 views of the three-dimensional model; one view is shown as an example, and the projection view of the model is shown in fig. 3;
Steps 1-3, the view obtained using the Canny edge detection algorithm is shown in fig. 4;
step 2, designing a deep convolutional neural network, and optimizing a network model by using an interactive attention module, as shown in fig. 5, specifically:
Step 2-1, to obtain a better feature extraction effect, design a deep convolutional neural network comprising 5 convolutional layers, 4 pooling layers, two dropout layers, a concatenation layer and a fully connected layer.
Step 2-2, embed the interactive attention module into the designed convolutional neural network structure: connect a global pooling layer after the convolutional layer output and compute the information amount Z_k of each channel in the convolutional layer. Taking the sketch as an example, the information amounts of its first convolutional layer are as follows:
Z k =[[0.0323739 0.04996519 0.0190248 0.03274497 0.03221277 0.00206719 0.04075038 0.01613641 0.03390235 0.04024649 0.03553107 0.00632962 0.03442683 0.04588291 0.01900478 0.02144121 0.03710039 0.03861086 0.05596253 0.0439686 0.03611921 0.04850776 0.00716817 0.02596463 0.00525256 0.03657651 0.02809189 0.03490375 0.04528182 0.03938764 0.00690786 0.04449471]]
Step 2-3, connect two fully connected layers after the global pooling layer and adaptively adjust the attention weight S_kn of each channel according to the information amount. Taking the sketch as an example, the attention weights of the sketch are as follows:
S kn =[[0.49450904 0.49921992 0.50748134 0.5051483 0.5093386 0.49844238 0.50426346 0.50664175 0.5053692 0.5012332 0.5004162 0.49788538 0.505669 0.5012219 0.5009724 0.4942028 0.49796405 0.4992011 0.5064934 0.4963113 0.50500274 0.50238824 0.50202376 0.49661288 0.50185806 0.5048757 0.5073203 0.50703263 0.51684725 0.50641936 0.5052296 0.4979179]]
Step 2-4, calculate the interactive attention weights S_k1 and S_k2 of the two neighboring convolutional layers respectively and fuse them to obtain the optimal attention weight S_k. The optimal attention weights of the sketch are as follows:
S k =[[0.4625304 0.47821882 0.5064253 0.5032532 0.5093386 0.49877496 0.50426346 0.50664175 0.5053692 0.5012332 0.5004162 0.49784237 0.505688 0.5011142 0.5008647 0.4942028 0.49796405 0.4991069 0.5064934 0.4963113 0.5102687 0.50125698 0.502524856 0.49675384 0.49365704 0.5027958 0.5076529 0.50814523 0.51006527 0.50361942 0.50422731 0.4635842]]
Step 2-5, fuse the attention weight S_k with the second convolutional layer conv_2 and the first pooling layer a_p to obtain the final result a_2. Partial results for the second convolutional layer of the sketch are:
a 2 =[[[[0.14450312 0.0644969 0.10812703...0.18608719 0.01994037 0]
[0.18341058 0.15881275 0.24716881...0.18875208 0.14420813 0.08290599]
[0.17390229 0.14937611 0.2255666...0.15295741 0.18792515 0.08066748]
...
[0.31344187 0.18656467 0.22178406...0.22087486 0.22130579 0.00955889]
[0.12405898 0.10548315 0.11685486...0.10439464 0.2906406 0.14846338]]
[[0.10032222 0.21919143 0.09797319...0.13584027 0. 0.12112971]
[0.20946684 0.14252397 0.17954415...0.09708451 0. 0.15463363]
[0.06941956 0.03963253 0.13273408...0.00173131 0.04566149 0.14895247]
...
[[0.01296724 0.27460644 0.09022377...0.06938899 0.04487894 0.2567152]
[0.16118288 0.38024116 0.02033611...0.13374138 0 0.17068687]
[0.09430372 0.35878736 0...0.0846955 0 0.25289127]
...
[0.10363265 0.4103881 0...0.0728834 0 0.29586816]
[0.18578637 0.34666267 0...0.05323519 0 0.27042198]
[0.0096841 0.18718664 0...0.04646093 0.00576336 0.155898]]]]
step 3, training the convolutional neural network model, as shown in fig. 6, specifically comprising the following steps:
step 3-1, inputting the sketch and the edge two-dimensional view into an initialized interactive attention convolution neural network as training data;
step 3-2, extracting more detailed view characteristics through the convolution layer;
3-3, after the attention module is fused with the neighborhood convolution layer through the weighting channel, the information lost when the edge view of the hand-drawn sketch or the model is pooled can be reduced;
step 3-4, extracting the maximum view information through a pooling layer;
step 3-5, passing through a Dropout layer, reducing the overfitting problem caused by insufficient training samples;
3-6, after alternately operating convolution, an attention module, dropout and pooling, finally inputting a full connection layer, and reducing the dimension of the extracted features to connect the extracted features into a one-dimensional high-level semantic feature vector;
Steps 3-7, the softmax function gives the probability of sketch "17205.png" under the "table" category as 89.99%.
Step 3-8, compare the prediction probability y_test_ij with the true probability y_ij and calculate the error loss using the cross-entropy loss function.
loss_17205 = -log(0.8999)
wherein loss_17205 denotes the error of sketch "17205.png".
And continuously iterating the interactive attention convolution neural network model to obtain the optimized interactive attention convolution neural network model.
Step 4, extracting semantic features and shape distribution features, specifically:
step 4-1, inputting the test data into the optimized interactive attention convolution neural network model, wherein the test process is shown in FIG. 7;
and 4-2, extracting the features of the full connection layer to be used as high-level semantic features of the hand-drawn sketch or the model view. Part of the high-level semantic features of the extracted sketch are as follows:
Feature=[[0,0.87328064,0,0,1.3293583,0,2.3825126,0,0,4.8035927,0,1.5186063,0,3.6845286,1.0825952,0,1.8516512,1.0285587,0,0,0,3.3322043,1.0545557,0,0,4.8707848,3.042554,0,0,0,0,6.8227463,2.537525,1.5318785,2.7271123,0,3.0482264……]]
Step 4-3, divide the sketch or two-dimensional view into 4 × 4 blocks;
Step 4-4, process each block with 32 Gabor filters over 4 scales and 8 directions, and combine the processed features to obtain the gist feature. The extracted Gist feature has 512 dimensions (a code sketch of this step follows step 4-9 below); partial Gist features of the sketch are as follows:
G(x,y)=[[5.81147151e-03 1.51588341e-02 1.75721212e-03 2.10059434e-01 1.62918585e-01 1.54040498e-01 1.44374291e-01 8.71880878e-01 5.26758657e-01 4.14263371e-01 7.17606844e-01 6.22190594e-01 1.11205845e-01 7.69002490e-04 2.18182730e-01 2.29565939e-01 9.32599080e-03 1.10805327e-02 1.40071468e-03 2.58543039e-01 5.67934220e-02 1.06132064e-01 9.10082146e-02 4.02163211e-01 2.97883778e-01 2.45860956e-01 4.02066928e-01 2.84401506e-01
1.03228724e-01 6.37419945e-04 2.71290458e-01……]]
step 4-5, randomly and equidistantly sampling points on the boundary of the sketch or the two-dimensional view;
And 4-6, represent the distance between the centroid and random sample points on the boundary of the sketch or two-dimensional view using the D1 descriptor. Part of the D1 descriptor of the sketch is as follows:
D1=[0.30470497858541628,0.6256941275550102,0.11237884569183111,0.23229854666522,0.2657159486944761,0.0731852015843772,0.40751749800795261……]
Steps 4-7, use the D2 descriptor to describe the distance between two random sample points on the sketch or two-dimensional view boundary. Part of the D2 descriptor of the sketch is as follows:
D2=[0.13203683803844625,0.028174099301372796,0.15392681513105217,0.130238265264,0.123460163767958,0.06985106421513015,0.12992235205980568……]
Steps 4-8, use the D3 descriptor to describe the square root of the area formed by three random sample points on the sketch or two-dimensional view boundary. Part of the D3 descriptor of the sketch is as follows:
D3=[0.9193157274532394,0.5816923854309814,0.46980644879802125,0.498873567635874,0.7195175116705602,0.29425190983247506,0.8724092377243926……]
Step 4-9, connect D1, D2 and D3 in series to form the shape distribution feature;
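As referenced in step 4-4 above, a Python sketch of the gist extraction using OpenCV Gabor kernels; the kernel size, sigma, lambda and per-block averaging are assumptions not fixed by the patent.

import cv2
import numpy as np

def gist_feature(img, scales=4, orientations=8, grid=4):
    # img: grayscale sketch or 2D view as a float32 array.
    h, w = img.shape
    bh, bw = h // grid, w // grid
    feats = []
    for i in range(scales):
        for j in range(orientations):
            kern = cv2.getGaborKernel(
                ksize=(15, 15), sigma=2.0 * (i + 1),  # scale i
                theta=np.pi * j / orientations,       # direction j
                lambd=8.0, gamma=0.5)
            resp = cv2.filter2D(img, cv2.CV_32F, kern)  # g_ij * I
            for by in range(grid):                      # 4 x 4 blocks
                for bx in range(grid):
                    block = resp[by*bh:(by+1)*bh, bx*bw:(bx+1)*bw]
                    feats.append(np.abs(block).mean())  # one value per block
    return np.asarray(feats)  # 4 * 8 * 16 = 512 dimensions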
Step 5, fuse the multiple features of the sketch and retrieve the model most similar to the hand-drawn sketch according to the similarity measurement formula, specifically:
Step 5-1, comparing various similarity measurement methods, the Euclidean distance gives the best final effect;
Step 5-2, extract feature vectors from the two-dimensional views and the sketch with the improved interactive attention convolutional neural network and normalize them. The similarity computed with the Euclidean distance is denoted distance1; the retrieval accuracy is 0.96;
Step 5-3, extract feature vectors of the sketch and the model views with the gist feature and normalize them. The similarity computed with the Euclidean distance is denoted distance2; the retrieval accuracy is 0.53;
Step 5-4, extract feature vectors of the sketch and the model views with the two-dimensional shape distribution feature and normalize them. The similarity computed with the Euclidean distance is denoted distance3; the retrieval accuracy is 0.42;
and 5-5, determining the weight according to the retrieval accuracy of the three characteristics.
w_1 = t_1/(t_1 + t_2 + t_3) = 0.96/1.91 ≈ 0.50
w_2 = t_2/(t_1 + t_2 + t_3) = 0.53/1.91 ≈ 0.28
w_3 = t_3/(t_1 + t_2 + t_3) = 0.42/1.91 ≈ 0.22
The final weights are therefore determined as 5:3:2:
Sim(distance) = 0.5·distance1 + 0.3·distance2 + 0.2·distance3
And 5-6, sorting according to the similarity from small to large to realize the retrieval effect.
The three-dimensional model retrieval method based on the interactive attention convolutional neural network adopts weighted fusion of traditional features and deep features, and achieves a good retrieval effect.
The foregoing is a detailed description of embodiments of the invention with reference to the accompanying drawings; the specific embodiments are merely provided to assist in understanding the method of the invention. Those skilled in the art can make variations and modifications within the scope of the embodiments and applications according to the concept of the invention, and the invention should not be construed as being limited thereto.

Claims (5)

1. A three-dimensional model retrieval method based on an interactive attention convolution neural network is characterized by comprising the following steps:
step 1: carrying out data preprocessing, projecting the three-dimensional model to obtain a plurality of views corresponding to the three-dimensional model and obtaining an edge view set of the model by using an edge detection algorithm;
step 2: designing a deep convolutional neural network, optimizing a network model by using an interactive attention module, selecting one part of view sets as a training set, and selecting the other part of view sets as a test set, wherein the method comprises the following steps:
step 2-1, determining the depth of a convolutional neural network, the size of a convolutional kernel, and the number of convolutional layers and pooling layers;
step 2-2, designing an interactive attention module: a global pooling layer is connected after the output of convolutional layer conv_n, and the information amount Z_k of each channel is computed; the information amount calculation formula is as follows:
Z_k = (1/(W_n × H_n)) · Σ_{i=1}^{H_n} Σ_{j=1}^{W_n} conv_nk(i, j)
wherein conv_nk denotes the kth feature map output by the nth convolutional layer, of size H_n × W_n;
step 2-3, connecting two fully connected layers after the global pooling layer, and adaptively adjusting the attention weight S_kn of each channel according to the information amount; the weight is calculated as follows:
S_kn = F_ex(Z, W) = σ(g(Z, W)) = σ(W_2 · δ(W_1 · Z));
wherein δ is the ReLU function, σ is the sigmoid function, and W_1, W_2 are the weights of the first and second fully connected layers, respectively;
step 2-4, respectively calculating the interactive attention weights S_k1 and S_k2 of the two neighboring convolutional layers, and fusing them to obtain the optimal attention weight S_k; the calculation formula of the optimal attention weight is as follows:
S_k = Average(S_k1, S_k2);
step 2-5, fusing the attention weight S_k with the second convolutional layer conv_2 and the first pooling layer a_p to obtain the final result a_2; the fusion formula is as follows:
a_2 = S_k ⊗ conv_2 ⊕ a_p;
wherein ⊗ scales each channel of conv_2 by its attention weight and ⊕ denotes element-wise addition;
selecting one part of the view set as the training set and the other part as the test set;
and 3, step 3: training comprises a forward propagation process and a backward propagation process, training data are used as input of interactive attention convolution neural network model training, and an optimized interactive attention convolution neural network model is obtained through the training of the interactive attention convolution neural network model;
and 4, step 4: extracting semantic features of the freehand sketch and the model view respectively by using the optimized interactive attention convolution neural network model and gist features, and extracting two-dimensional shape distribution features of the freehand sketch and the model view respectively by using two-dimensional shape distribution features;
and 5: and (4) weighting and fusing the plurality of features, and retrieving the model which is most similar to the hand-drawn sketch according to the Euclidean distance.
2. The method for retrieving a three-dimensional model based on an interactive attention convolutional neural network as claimed in claim 1, wherein in the step 1, the three-dimensional model is projected to obtain a plurality of views corresponding to the three-dimensional model and an edge detection algorithm is used to obtain an edge view set of the model, and the specific steps are as follows:
step 1-1, arranging a three-dimensional model in the center of a virtual sphere;
step 1-2, placing a virtual camera above the model, and rotating the model through 360 degrees in 30-degree steps to obtain a set of 12 views of the three-dimensional model;
1-3, obtaining respective edge views of 12 original view sets by using a Canny edge detection algorithm;
after the three-dimensional model is projected, the three-dimensional model is characterized into a group of two-dimensional views, and the semantic gap between the hand-drawn sketch and the three-dimensional model view can be reduced by using a Canny edge detection algorithm.
3. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in the step 3, the convolutional neural network model is trained, and the specific steps are as follows:
step 3-1, inputting training data into an initialized interactive attention convolution neural network model;
step 3-2, extracting more detailed view features through the convolutional layers, extracting low-level features through the shallow-level convolutional layers, and extracting high-level semantic features through the high-level convolutional layers;
3-3, after the attention module is fused with the neighborhood convolution layer through the weighting channel, reducing information lost when the edge view of the hand-drawn sketch or the model is pooled;
step 3-4, the scale of the view features is reduced through a pooling layer, so that the number of parameters is reduced, and the speed of model calculation is increased;
step 3-5, through a Dropout layer, the overfitting problem caused by insufficient training samples is relieved;
3-6, after alternately operating convolution, attention module, dropout and pooling, finally inputting a full connection layer, and reducing the dimension of the extracted features to connect the extracted features into a one-dimensional high-level semantic feature vector;
steps 3-7, in the back propagation process, using the labeled 2D views to optimize the weights and biases of the interactive attention convolutional neural network, wherein the 2D view set is {v_1, v_2, …, v_n} with label set {l_1, l_2, …, l_n}; the 2D views have t classes, including 1, 2, …, t; after forward propagation, the prediction probability of v_i in class j is y_test_ij; the label l_i of v_i is compared with class j to compute the expected probability y_ij; the formula for calculating the probability is as follows:
y_ij = 1 if l_i = j, and y_ij = 0 otherwise;
step 3-8, comparing the prediction probability y_test_ij with the true probability y_ij, and calculating the error loss using a cross-entropy loss function;
the error loss is calculated as follows:
loss = -Σ_{i=1}^{n} Σ_{j=1}^{t} y_ij · log(y_test_ij);
and continuously iterating the interactive attention convolution neural network model to obtain an optimized interactive attention convolution neural network model, and storing the weight and the bias.
4. The method for retrieving a three-dimensional model based on an interactive attention convolution neural network as claimed in claim 1, wherein in the step 4, the optimized interactive attention convolution neural network model and gist feature are used to extract semantic features of a freehand sketch and a model view respectively, and two-dimensional shape distribution features of the freehand sketch and the model view are extracted respectively by using two-dimensional shape distribution features, and the specific process is as follows:
step 4-1, inputting test data into the optimized interactive attention convolution neural network model;
4-2, extracting the characteristics of the full connection layer to serve as high-level semantic characteristics of the hand-drawn sketch or the model view;
step 4-3, dividing the sketch or 2D view of size m × n into 4 × 4 blocks, each block having size a × b, where a = m/4, b = n/4;
step 4-4, processing each block with 32 Gabor filters over 4 scales and 8 directions, and combining the processed features to obtain the gist feature; the formula is as follows:
G(x, y) = cat( g_ij(x, y) * I(x, y) ), i = 1, …, 4, j = 1, …, 8;
wherein G(x, y) is the gist feature over the 32 Gabor filters, cat() denotes the concatenation operation, x and y are pixel positions, I(x, y) denotes a block, g_ij(x, y) is the filter at the ith scale and jth direction, and * denotes convolution;
step 4-5, randomly and equidistantly sampling points on the boundary of the sketch or 2D view, collected as points = {(x_1, y_1), …, (x_i, y_i), …, (x_n, y_n)}, where (x_i, y_i) are the coordinates of a point;
steps 4-6, using the D1 descriptor to represent the distance between the centroid and a random sample point on the sketch or two-dimensional view boundary; points are drawn from points and collected as PD1 = {ai_1, …, ai_k, …, ai_N}; the D1 shape distribution features are {D1_v_1, …, D1_v_i, …, D1_v_Bins}, where D1_v_i is the statistic of the interval (BinSize·(i-1), BinSize·i), Bins is the number of intervals, and BinSize is the interval length; D1_v_i is calculated as follows:
D1_v_i = |{P | dist(P, O) ∈ (BinSize·(i-1), BinSize·i), P ∈ PD1}|;
wherein BinSize = max({dist(P, O) | P ∈ PD1})/N, dist() is the Euclidean distance between two points, and O is the centroid of the sketch or 2D view;
steps 4-7, using the D2 descriptor to describe the distance between two random sample points on the sketch or two-dimensional view boundary; point pairs are drawn from points and collected as PD2 = {(ai_1, bi_1), (ai_2, bi_2), …, (ai_N, bi_N)}; the D2 shape distribution features are {D2_v_1, …, D2_v_i, …, D2_v_Bins}, where D2_v_i represents the statistic in the interval (BinSize·(i-1), BinSize·i); D2_v_i is calculated as follows:
D2_v_i = |{P | dist(P) ∈ (BinSize·(i-1), BinSize·i), P ∈ PD2}|;
wherein BinSize = max({dist(P) | P ∈ PD2})/N;
steps 4-8, using the D3 descriptor to describe the square root of the area formed by three random sample points on the sketch or 2D view boundary; point triplets are drawn from points and collected as PD3 = {(ai_1, bi_1, ci_1), (ai_2, bi_2, ci_2), …, (ai_n, bi_n, ci_n)}; the D3 shape distribution features are {D3_v_1, …, D3_v_i, …, D3_v_Bins}, where D3_v_i represents the statistic in the interval (BinSize·(i-1), BinSize·i); D3_v_i is:
D3_v_i = |{P | herson(P) ∈ (BinSize·(i-1), BinSize·i), P ∈ PD3}|;
wherein,
BinSize = max({herson(P) | P ∈ PD3})/N;
herson() denotes Heron's formula; the square root of the area of the triangle P = (P_1, P_2, P_3) is computed using Heron's formula as follows:
herson(P) = √S, S = √(p·(p-a)·(p-b)·(p-c));
p = (a + b + c)/2;
wherein a = dist(P_1, P_2), b = dist(P_1, P_3), c = dist(P_2, P_3);
and 4-9, connecting D1_v_i, D2_v_i and D3_v_i to form the shape distribution feature, i = 1, 2, …, Bins.
5. The method for retrieving a three-dimensional model based on an interactive attention convolution neural network according to claim 1, wherein in the step 5, a plurality of features are fused, and a model most similar to a hand-drawn sketch is retrieved according to a similarity measurement formula, and the specific process is as follows:
step 5-1, selecting Euclidean distance as a similarity measurement method;
step 5-2, extracting feature vectors from the two-dimensional views and the sketch with the improved interactive attention convolutional neural network and normalizing them; the similarity computed with the Euclidean distance is denoted distance1 and the retrieval accuracy t1;
step 5-3, extracting feature vectors of the sketch and the model views with the gist feature and normalizing them; the similarity computed with the Euclidean distance is denoted distance2 and the retrieval accuracy t2;
step 5-4, extracting feature vectors of the sketch and the model views with the two-dimensional shape distribution feature and normalizing them; the similarity computed with the Euclidean distance is denoted distance3 and the retrieval accuracy t3;
step 5-5, comparing the accuracies of the three features, and fusing the features by weighting into a new feature similarity Sim(distance); the formula is as follows:
Sim(distance) = w_1·distance1 + w_2·distance2 + w_3·distance3, with w_1 + w_2 + w_3 = 1;
wherein w_1 = t_1/(t_1 + t_2 + t_3), w_2 = t_2/(t_1 + t_2 + t_3), w_3 = t_3/(t_1 + t_2 + t_3);
and 5-6, sorting by similarity from small to large to obtain the retrieval result.
CN202110270518.7A 2021-03-12 2021-03-12 Three-dimensional model retrieval method based on interactive attention convolution neural network Active CN113032613B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110270518.7A CN113032613B (en) 2021-03-12 2021-03-12 Three-dimensional model retrieval method based on interactive attention convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110270518.7A CN113032613B (en) 2021-03-12 2021-03-12 Three-dimensional model retrieval method based on interactive attention convolution neural network

Publications (2)

Publication Number Publication Date
CN113032613A CN113032613A (en) 2021-06-25
CN113032613B true CN113032613B (en) 2022-11-08

Family

ID=76470237

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110270518.7A Active CN113032613B (en) 2021-03-12 2021-03-12 Three-dimensional model retrieval method based on interactive attention convolution neural network

Country Status (1)

Country Link
CN (1) CN113032613B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113658176B (en) * 2021-09-07 2023-11-07 重庆科技学院 Ceramic tile surface defect detection method based on interaction attention and convolutional neural network
CN114373077B * 2021-12-07 2024-10-29 Yanshan University Sketch recognition method based on double-hierarchy structure
CN114492593B (en) * 2021-12-30 2024-08-16 哈尔滨理工大学 Three-dimensional model classification method based on EFFICIENTNET and convolutional neural network
CN114842287B (en) * 2022-03-25 2022-12-06 中国科学院自动化研究所 Monocular three-dimensional target detection model training method and device of depth-guided deformer
CN117952966B (en) * 2024-03-26 2024-10-22 华南理工大学 Sinkhorn algorithm-based multi-mode fusion survival prediction method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101004748A (en) * 2006-10-27 2007-07-25 北京航空航天大学 Method for searching 3D model based on 2D sketch
CN101089846A (en) * 2006-06-16 2007-12-19 国际商业机器公司 Data analysis method, equipment and data analysis auxiliary method
CN101110826A (en) * 2007-08-22 2008-01-23 张建中 Method, device and system for constructing multi-dimensional address
CN107122396A (en) * 2017-03-13 2017-09-01 西北大学 Three-dimensional model searching algorithm based on depth convolutional neural networks
CN110569386A (en) * 2019-09-16 2019-12-13 哈尔滨理工大学 Three-dimensional model retrieval method based on hand-drawn sketch integrated descriptor
CN111597367A (en) * 2020-05-18 2020-08-28 河北工业大学 Three-dimensional model retrieval method based on view and Hash algorithm

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101350016B (en) * 2007-07-20 2010-11-24 富士通株式会社 Device and method for searching three-dimensional model
CN103295025B (en) * 2013-05-03 2016-06-15 南京大学 A kind of automatic selecting method of three-dimensional model optimal view
CN105243137B (en) * 2015-09-30 2018-12-11 华南理工大学 A kind of three-dimensional model search viewpoint selection method based on sketch
JP6798183B2 (en) * 2016-08-04 2020-12-09 株式会社リコー Image analyzer, image analysis method and program
CN109783887A (en) * 2018-12-25 2019-05-21 西安交通大学 A kind of intelligent recognition and search method towards Three-dimension process feature
CN110033023B (en) * 2019-03-11 2021-06-15 北京光年无限科技有限公司 Image data processing method and system based on picture book recognition
CN111078913A (en) * 2019-12-16 2020-04-28 天津运泰科技有限公司 Three-dimensional model retrieval method based on multi-view convolution neural network
CN111242207A (en) * 2020-01-08 2020-06-05 天津大学 Three-dimensional model classification and retrieval method based on visual saliency information sharing
CN111625667A (en) * 2020-05-18 2020-09-04 北京工商大学 Three-dimensional model cross-domain retrieval method and system based on complex background image

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101089846A (en) * 2006-06-16 2007-12-19 国际商业机器公司 Data analysis method, equipment and data analysis auxiliary method
CN101004748A (en) * 2006-10-27 2007-07-25 北京航空航天大学 Method for searching 3D model based on 2D sketch
CN101110826A (en) * 2007-08-22 2008-01-23 张建中 Method, device and system for constructing multi-dimensional address
CN107122396A (en) * 2017-03-13 2017-09-01 西北大学 Three-dimensional model searching algorithm based on depth convolutional neural networks
CN110569386A (en) * 2019-09-16 2019-12-13 哈尔滨理工大学 Three-dimensional model retrieval method based on hand-drawn sketch integrated descriptor
CN111597367A (en) * 2020-05-18 2020-08-28 河北工业大学 Three-dimensional model retrieval method based on view and Hash algorithm

Also Published As

Publication number Publication date
CN113032613A (en) 2021-06-25

Similar Documents

Publication Publication Date Title
CN113032613B (en) Three-dimensional model retrieval method based on interactive attention convolution neural network
CN110598029B (en) Fine-grained image classification method based on attention transfer mechanism
CN106228185B (en) A kind of general image classifying and identifying system neural network based and method
CN112633350B (en) Multi-scale point cloud classification implementation method based on graph convolution
CN112347970B (en) Remote sensing image ground object identification method based on graph convolution neural network
CN110222218B (en) Image retrieval method based on multi-scale NetVLAD and depth hash
CN114841257B (en) Small sample target detection method based on self-supervision comparison constraint
CN107480261A (en) One kind is based on deep learning fine granularity facial image method for quickly retrieving
CN112613552B (en) Convolutional neural network emotion image classification method combined with emotion type attention loss
CN113408605A (en) Hyperspectral image semi-supervised classification method based on small sample learning
CN110633708A (en) Deep network significance detection method based on global model and local optimization
CN108052966A (en) Remote sensing images scene based on convolutional neural networks automatically extracts and sorting technique
WO2023019698A1 (en) Hyperspectral image classification method based on rich context network
CN108427740B (en) Image emotion classification and retrieval algorithm based on depth metric learning
CN110543906B (en) Automatic skin recognition method based on Mask R-CNN model
CN110751027B (en) Pedestrian re-identification method based on deep multi-instance learning
CN112364931A (en) Low-sample target detection method based on meta-feature and weight adjustment and network model
CN111125411A (en) Large-scale image retrieval method for deep strong correlation hash learning
CN111506760B (en) Depth integration measurement image retrieval method based on difficult perception
CN114510594A (en) Traditional pattern subgraph retrieval method based on self-attention mechanism
CN112733602B (en) Relation-guided pedestrian attribute identification method
CN110263855A (en) A method of it is projected using cobasis capsule and carries out image classification
CN114926742B (en) Loop detection and optimization method based on second-order attention mechanism
CN116258990A (en) Cross-modal affinity-based small sample reference video target segmentation method
CN115457332A (en) Image multi-label classification method based on graph convolution neural network and class activation mapping

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant