DOI: 10.1145/3503161.3548224
research-article

Cross-Compatible Embedding and Semantic Consistent Feature Construction for Sketch Re-identification

Published: 10 October 2022 Publication History

Abstract

Sketch re-identification (Re-ID) uses sketches of pedestrians to retrieve the corresponding photos from surveillance videos. It allows pedestrians to be tracked from sketches drawn according to eyewitness descriptions, without requiring a query photo. Although the Sketch Re-ID concept has been proposed, the gap between the sketch and photo modalities still greatly hinders pedestrian identity matching. Based on the idea of transplantation without rejection, we propose a Cross-Compatible Embedding (CCE) approach to narrow this gap, together with a Semantic Consistent Feature Construction (SCFC) scheme to enhance feature discrimination. Under the guidance of identity consistency, the CCE performs cross-modal interchange at the local token level in the Transformer framework, enabling the model to extract modality-compatible features. The SCFC improves the representation ability of features by handling the inconsistency of information at the same location in a sketch and the corresponding pedestrian photo. It divides the local tokens of pedestrian images from different modalities into groups and assigns specific semantic information to each group to construct a semantically consistent global feature representation. Experiments on the public Sketch Re-ID dataset confirm the effectiveness of the proposed method and its superiority over existing methods. Experiments on the sketch-based image retrieval datasets QMUL-Shoe-v2 and QMUL-Chair-v2 assess the method's generalization, and the results show that the proposed method outperforms the state-of-the-art works compared. The source code of our method is available at: https://github.com/lhf12278/CCSC.
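
The two ideas in the abstract, token-level interchange between modalities and group-wise construction of a global feature, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). The function names, the random choice of which tokens to swap, the `swap_ratio` parameter, the externally supplied `group_ids`, and the mean-pooling of each group are all assumptions made for illustration only.

```python
import numpy as np


def exchange_local_tokens(sketch_tokens, photo_tokens, swap_ratio=0.5, rng=None):
    """Swap a random subset of local tokens between a sketch and a photo.

    Both inputs are (num_tokens, dim) arrays assumed to come from the same
    identity. Random selection and swap_ratio are illustrative assumptions.
    """
    if sketch_tokens.shape != photo_tokens.shape:
        raise ValueError("sketch and photo token arrays must have the same shape")
    rng = np.random.default_rng() if rng is None else rng
    num_tokens = sketch_tokens.shape[0]
    num_swap = int(round(swap_ratio * num_tokens))
    # Positions whose tokens are exchanged between the two modalities.
    swap_idx = rng.choice(num_tokens, size=num_swap, replace=False)
    mixed_sketch = sketch_tokens.copy()
    mixed_photo = photo_tokens.copy()
    mixed_sketch[swap_idx] = photo_tokens[swap_idx]
    mixed_photo[swap_idx] = sketch_tokens[swap_idx]
    return mixed_sketch, mixed_photo, swap_idx


def grouped_global_feature(local_tokens, group_ids, num_groups):
    """Build a global feature by pooling local tokens group by group.

    group_ids[i] is the (hypothetical) semantic group of token i. Each group
    is mean-pooled and the group descriptors are concatenated in a fixed
    order, so the same slot of the output always describes the same group.
    """
    dim = local_tokens.shape[1]
    groups = np.zeros((num_groups, dim), dtype=local_tokens.dtype)
    for g in range(num_groups):
        members = local_tokens[group_ids == g]
        if len(members) > 0:  # empty groups keep a zero descriptor
            groups[g] = members.mean(axis=0)
    return groups.reshape(-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sketch = rng.normal(size=(8, 4))
    photo = rng.normal(size=(8, 4))
    mixed_sketch, mixed_photo, idx = exchange_local_tokens(sketch, photo, 0.5, rng)
    group_ids = np.array([0, 0, 1, 1, 2, 2, 3, 3])
    feature = grouped_global_feature(mixed_sketch, group_ids, num_groups=4)
    print(sorted(idx.tolist()), feature.shape)
```

Because the group descriptors are concatenated in a fixed order, a sketch and a photo whose tokens land in different spatial positions would still be compared group against group. How groups are actually assigned and what semantics they carry is defined in the paper, not here.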

Supplementary Material

MP4 File (MM22-fp1955.mp4)
Presentation video for ACM MM 2022 paper "Cross-Compatible Embedding and Semantic Consistent Feature Construction for Sketch Re-identification"

      Published In

      MM '22: Proceedings of the 30th ACM International Conference on Multimedia
      October 2022
      7537 pages
      ISBN:9781450392037
      DOI:10.1145/3503161
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. cross transplantation
      2. cross-compatible embedding
      3. semantic consistent features
      4. sketch re-identification

      Qualifiers

      • Research-article

      Funding Sources

      • National Natural Science Foundation of China

      Conference

      MM '22
      Acceptance Rates

      Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

      Article Metrics

      • Downloads (Last 12 months): 75
      • Downloads (Last 6 weeks): 5
      Reflects downloads up to 20 Nov 2024

      Cited By

      • (2024) LGRL: Local-Global Representation Learning for On-the-Fly FG-SBIR. IEEE Transactions on Big Data 10:4 (543-555). DOI: 10.1109/TBDATA.2024.3356393. Online publication date: Aug-2024
      • (2024) Oriented R-CNN With Disentangled Representations for Product Packaging Detection. IEEE Photonics Journal 16:5 (1-11). DOI: 10.1109/JPHOT.2024.3450295. Online publication date: Oct-2024
      • (2024) Shallow-Deep Collaborative Learning for Unsupervised Visible-Infrared Person Re-Identification. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (16870-16879). DOI: 10.1109/CVPR52733.2024.01596. Online publication date: 16-Jun-2024
      • (2024) A Systematic Literature Review of Deep Learning Approaches for Sketch-Based Image Retrieval: Datasets, Metrics, and Future Directions. IEEE Access 12 (14847-14869). DOI: 10.1109/ACCESS.2024.3357939. Online publication date: 2024
      • (2024) A review on video person re-identification based on deep learning. Neurocomputing (128479). DOI: 10.1016/j.neucom.2024.128479. Online publication date: Aug-2024
      • (2024) Transformer for Object Re-identification: A Survey. International Journal of Computer Vision. DOI: 10.1007/s11263-024-02284-4. Online publication date: 23-Nov-2024
      • (2023) Survey of Cross-Modal Person Re-Identification from a Mathematical Perspective. Mathematics 11:3 (654). DOI: 10.3390/math11030654. Online publication date: 28-Jan-2023
      • (2023) Context-aware lightweight remote-sensing image super-resolution network. Frontiers in Neurorobotics 17. DOI: 10.3389/fnbot.2023.1220166. Online publication date: 23-Jun-2023
      • (2023) Hir-net: a simple and effective heterogeneous image restoration network. Signal, Image and Video Processing 18:1 (773-784). DOI: 10.1007/s11760-023-02779-6. Online publication date: 16-Oct-2023
      • (2023) Semantic consistent feature construction and multi-granularity feature learning for visible-infrared person re-identification. The Visual Computer: International Journal of Computer Graphics 40:4 (2363-2379). DOI: 10.1007/s00371-023-02923-w. Online publication date: 27-Jun-2023
