Unsupervised Trademark Retrieval Method Based on Attention Mechanism
Figure 1. Schematic diagram of attention mechanism in trademark image.
Figure 2. Unsupervised trademark retrieval method with embedded attention module.
Figure 3. ECA (Efficient Channel Attention) module structure.
Figure 4. Unsupervised feature learning approach.
Figure 5. Some images from three groups in the query set.
Figure 6. The performance of trademark retrieval results under different k values. NAR: Normalized Average Rank; MSE: Mean Squared Error.
Figure 7. Visualization effect of Convolutional Neural Network (CNN) on trademark image.
Figure 8. Comparison of trademarks retrieved by ResNet50, SENet, and ECANet.
1. Introduction
2. Related Work
2.1. Unsupervised Learning
2.2. Attention Mechanism
3. The Proposed Method
3.1. Learning about Important Features of Trademarks
3.2. Instance Discrimination
- Select training samples from the trademark database and preprocess them to form training batches;
- Feed the training set into the unsupervised network, extract features to obtain the initial feature set, and store it as the features corresponding to the current batch;
- Draw negative samples from the stored feature set;
- Compute the loss between the instance sample and the noise samples drawn from the memory bank;
- Use backpropagation to optimize the objective and update the parameters until training ends.
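The steps above can be sketched roughly as follows. This is a minimal sketch, not the authors' implementation: it substitutes a softmax cross-entropy over one positive and a few sampled negatives for the noise-contrastive loss, and the names `instance_loss`, `update_memory`, `temperature`, and `momentum`, along with their default values, are assumptions made for illustration.

```python
import math
import random


def l2_normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def instance_loss(feature, index, memory_bank,
                  num_negatives=4, temperature=0.07, rng=random):
    """Cross-entropy of instance `index` against negatives sampled from the bank.

    Assumption: a softmax over (1 positive + sampled negatives) stands in
    for the noise-contrastive objective mentioned in the text.
    """
    candidates = [j for j in range(len(memory_bank)) if j != index]
    negatives = rng.sample(candidates, min(num_negatives, len(candidates)))
    logits = [dot(feature, memory_bank[index]) / temperature]
    logits += [dot(feature, memory_bank[j]) / temperature for j in negatives]
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[0]


def update_memory(memory_bank, index, feature, momentum=0.5):
    """Blend the new feature into the stored one and re-normalize."""
    mixed = [momentum * m + (1.0 - momentum) * f
             for m, f in zip(memory_bank[index], feature)]
    memory_bank[index] = l2_normalize(mixed)
```

In a real training loop the loss would be backpropagated through the feature extractor; here only the loss value and the memory-bank update are shown, since those are the parts the list describes.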
3.3. Similarity Measure
3.4. The Process of Our Proposed Method
Algorithm 1: Unsupervised trademark retrieval method based on attention mechanism
Input: Retrieved image I, Trademark database M.
Output: Image sequence R which is similar to I.
Step 1: for i←1 to maximum_epochs do
  1. .
  2. , put V into the instance discrimination module.
  3. and optimize loss, update V iteratively.
  4. Backpropagate the loss and update the parameters.
  5. Repeat the above steps until the algorithm converges to get the feature extraction network N.
end for
Step 2:
  1. in the retrieval module.
  2. in the retrieval module.
  3. , output similar image sequence R.
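The retrieval stage of the algorithm (rank the database by similarity to the query and output the sequence R) can be illustrated with a short sketch. The similarity measure is assumed here to be cosine similarity, which the surviving text does not confirm, and the names `cosine_similarity`, `retrieve`, and `top_n` are hypothetical. Feature vectors are taken as given; in the method they would come from the trained network N.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_feature, database_features, top_n=5):
    """Return the `top_n` (name, score) pairs most similar to the query.

    `database_features` maps an image name to its feature vector.
    """
    scored = [(name, cosine_similarity(query_feature, feature))
              for name, feature in database_features.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

For example, `retrieve([1.0, 0.1], {"a": [1, 0], "b": [0, 1], "c": [1, 1]}, top_n=2)` ranks `"a"` first and `"c"` second.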
4. Experiment
4.1. METU Dataset
4.2. Evaluation Method and Metrics
4.3. Experimental Settings
4.3.1. Training Parameters
4.3.2. Effect of k on ECA Module
4.4. Experimental Results and Analysis
4.4.1. Compared with Traditional Feature Extraction Methods
4.4.2. Compared with Deep Learning Methods
4.4.3. Visualization of the Results
5. Conclusions
Author Contributions
Method | NAR ± MSE |
CH | 0.400 ± 0.175 |
LBP | 0.276 ± 0.142 |
GIST | 0.254 ± 0.173 |
SC | 0.220 ± 0.186 |
HOG | 0.262 ± 0.129 |
SIFT | 0.179 ± 0.145 |
OR-SIFT | 0.190 ± 0.151 |
SURF | 0.207 ± 0.151 |
Our Method | 0.051 ± 0.002 |
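For readers unfamiliar with the NAR column, the sketch below computes Normalized Average Rank under its commonly used definition, NAR = (Σ R_i − N_rel(N_rel + 1)/2) / (N · N_rel), where R_i are the ranks of the relevant images, N_rel is their count, and N is the collection size. That the paper uses exactly this definition is an assumption, and the function name is hypothetical. A value of 0 means every relevant image is ranked at the top; values near 0.5 correspond to random ordering.

```python
def normalized_average_rank(relevant_ranks, collection_size):
    """NAR for one query; ranks are 1-based positions of the relevant images."""
    n_rel = len(relevant_ranks)
    if n_rel == 0 or collection_size <= 0:
        raise ValueError("need at least one relevant image and a non-empty collection")
    best_possible_sum = n_rel * (n_rel + 1) / 2  # ranks 1..n_rel
    return (sum(relevant_ranks) - best_possible_sum) / (collection_size * n_rel)
```

With a collection of 100 images and three relevant images at ranks 1, 2, and 3, the function returns 0.0.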
Method | NAR ± MSE |
ResNet50 (FC1000) | 0.110 ± 0.133 |
ResNet50 (Pool5) | 0.095 ± 0.138 |
VGGNet16 (FC7) | 0.086 ± 0.107 |
AlexNet (FC7) | 0.112 ± 0.171 |
GoogleNet (77S1) | 0.118 ± 0.138 |
VGG19v | 0.066 ± 0.130 |
VGG19c | 0.063 ± 0.128 |
VGG19v + VGG19c | 0.047 ± 0.095 |
SENet | 0.056 ± 0.003 |
SENet (ResNeXt) | 0.055 ± 0.008 |
SKNet | 0.068 ± 0.002 |
CBAM | 0.056 ± 0.003 |
ResNet50 (dim = 128) | 0.063 ± 0.002 |
Our Method | 0.051 ± 0.002 |
Score Index | Pic1 | Pic2 | Pic3 | Pic4 |
US_1 | 0.837 | 0.802 | 0.881 | 0.894 |
US_2 | 0.821 | 0.744 | 0.824 | 0.731 |
US_3 | 0.692 | 0.673 | 0.803 | 0.625 |
US_4 | 0.667 | 0.661 | 0.752 | 0.612 |
US_5 | 0.655 | 0.606 | 0.670 | 0.580 |
RES_1 | 0.860 | 0.712 | 0.778 | 0.807 |
RES_2 | 0.734 | 0.654 | 0.773 | 0.579 |
RES_3 | 0.667 | 0.617 | 0.767 | 0.497 |
RES_4 | 0.605 | 0.560 | 0.694 | 0.426 |
RES_5 | 0.570 | 0.553 | 0.545 | 0.415 |
US_AVG | 0.734 | 0.697 | 0.786 | 0.688 |
RES_AVG | 0.687 | 0.619 | 0.711 | 0.545 |
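The US_AVG and RES_AVG rows appear to be the column means of the five US_ and RES_ rows, respectively. The short check below recomputes them from the table values; the variable and function names are illustrative only.

```python
# Scores copied from the table: rows are ranks 1..5, columns are Pic1..Pic4.
us_scores = [
    [0.837, 0.802, 0.881, 0.894],
    [0.821, 0.744, 0.824, 0.731],
    [0.692, 0.673, 0.803, 0.625],
    [0.667, 0.661, 0.752, 0.612],
    [0.655, 0.606, 0.670, 0.580],
]
res_scores = [
    [0.860, 0.712, 0.778, 0.807],
    [0.734, 0.654, 0.773, 0.579],
    [0.667, 0.617, 0.767, 0.497],
    [0.605, 0.560, 0.694, 0.426],
    [0.570, 0.553, 0.545, 0.415],
]


def column_averages(rows):
    """Mean of each column, rounded to three decimals as in the table."""
    return [round(sum(column) / len(column), 3) for column in zip(*rows)]


us_avg = column_averages(us_scores)
res_avg = column_averages(res_scores)
```

The recomputed means match the reported rows, and the US average exceeds the RES average for every picture.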
