Dual Dynamic Threshold Adjustment Strategy

Published: 15 May 2024

Abstract

Loss functions and sample mining strategies are essential components of deep metric learning algorithms. However, existing loss functions and mining strategies often require additional hyperparameters, most notably a threshold that determines whether a sample pair is informative. The threshold provides a stable numerical criterion for deciding whether to retain a pair and is vital for reducing the number of redundant sample pairs that participate in training. Nonetheless, finding the optimal threshold can be time-consuming, often requiring extensive grid searches: because the threshold cannot be adjusted dynamically during training, many repeated experiments are needed to determine it. We therefore introduce a novel approach for adjusting the thresholds associated with both the loss function and the sample mining strategy. For sample mining, we design a static Asymmetric Sample Mining Strategy (ASMS) and its dynamic version, the Adaptive Tolerance ASMS (AT-ASMS). ASMS uses differentiated thresholds for positive and negative pairs to address the problems (too few positive pairs and too many redundant negative pairs) caused by filtering samples with a single threshold. AT-ASMS adaptively regulates the ratio of positive to negative pairs during training according to the ratio of the currently mined positive and negative pairs. This meta-learning-based threshold generation algorithm uses a single-step gradient descent update to obtain new thresholds. We combine these two threshold adjustment algorithms to form the Dual Dynamic Threshold Adjustment Strategy (DDTAS). Experimental results show that our algorithm achieves competitive performance on the CUB200, Cars196, and SOP datasets. Our code is available at https://github.com/NUST-Machine-Intelligence-Laboratory/DDTAS.
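
To make the idea concrete, the sketch below illustrates the kind of asymmetric, ratio-aware pair mining the abstract describes. It is not the authors' released implementation; the function names, the cosine-similarity formulation, and the specific threshold and step values are illustrative assumptions only (see the linked repository for the actual DDTAS code).

import torch


def asymmetric_pair_mining(embeddings, labels, pos_thresh=0.8, neg_thresh=0.6):
    """Mine informative pairs with separate (asymmetric) thresholds.

    Positive pairs are kept when their cosine similarity is BELOW
    `pos_thresh` (hard positives); negative pairs are kept when their
    similarity is ABOVE `neg_thresh` (hard negatives). Embeddings are
    assumed to be L2-normalized. Values here are illustrative, not the
    paper's settings.
    """
    sim = embeddings @ embeddings.t()                    # pairwise cosine similarity
    same = labels.unsqueeze(0) == labels.unsqueeze(1)    # True where labels match
    eye = torch.eye(len(labels), dtype=torch.bool, device=labels.device)

    pos_mask = same & ~eye & (sim < pos_thresh)          # informative positive pairs
    neg_mask = ~same & (sim > neg_thresh)                # informative negative pairs
    return pos_mask, neg_mask


def adjust_thresholds(pos_thresh, neg_thresh, pos_mask, neg_mask,
                      target_ratio=1.0, step=0.01):
    """Heuristic stand-in for the adaptive-tolerance idea: nudge both
    thresholds so the ratio of mined positives to negatives drifts toward
    `target_ratio`. The actual method updates thresholds with a single-step
    gradient (meta-learning) update rather than this fixed-step heuristic."""
    n_pos = pos_mask.sum().item()
    n_neg = max(neg_mask.sum().item(), 1)
    if n_pos / n_neg < target_ratio:
        pos_thresh += step   # admit more hard positives
        neg_thresh += step   # admit fewer negatives
    else:
        pos_thresh -= step
        neg_thresh -= step
    return pos_thresh, neg_thresh

In a training loop, these masks would select the pair similarities fed to the loss, with the two thresholds re-estimated each iteration from the currently mined batch.
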


Cited By

  • (2024) Universal Organizer of Segment Anything Model for Unsupervised Semantic Segmentation. 2024 IEEE International Conference on Multimedia and Expo (ICME), 1–6. DOI: 10.1109/ICME57554.2024.10687775. Online publication date: 15 July 2024.
  • (2024) Selection of disassembly schemes for multiple types of waste mobile phones based on knowledge reuse and disassembly line balancing. Journal of Manufacturing Systems, 76, 207–221. DOI: 10.1016/j.jmsy.2024.07.013. Online publication date: October 2024.


Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 7
July 2024, 973 pages
EISSN: 1551-6865
DOI: 10.1145/3613662
Editor: Abdulmotaleb El Saddik

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 May 2024
Online AM: 03 April 2024
Accepted: 25 March 2024
Revised: 18 March 2024
Received: 10 October 2023
Published in TOMM Volume 20, Issue 7

Author Tags

  1. Deep metric learning
  2. sample mining strategy
  3. image retrieval

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Fundamental Research Funds for the Central Universities
