Abstract
Person re-identification, an essential research direction in intelligent security, has attracted growing attention from researchers. In practical scenarios, visible-light cameras depend heavily on illumination and have limited detection capability in poor light, so many researchers have gradually shifted their focus to cross-modality person re-identification. However, relevant studies remain scarce, and bridging the differences between images of different modalities is still challenging. To address these problems, this paper adopts an attention-based approach to narrow the gap between the two modalities and guide the network in a more suitable direction, thereby improving its recognition performance. Although attention mechanisms can improve training efficiency, they can also make model training unstable. This paper therefore proposes a cross-modality person re-identification method based on the attention mechanism. A new attention module is designed that allows the network to focus on the more critical features of a person in less time. In addition, a cross-modality hard center triplet loss is designed to better supervise model training. Extensive experiments with both methods on two publicly available datasets achieve better performance than comparable current methods, verifying the effectiveness and feasibility of the proposed approach.
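The abstract does not give the exact formulation of the cross-modality hard center triplet loss. As a rough sketch only, losses of this kind (e.g., hetero-center triplet variants) typically pull each identity's feature center in one modality toward the same identity's center in the other modality while pushing it away from the hardest (closest) different-identity center there. The function names, the margin value, and the center/hardest-negative formulation below are illustrative assumptions, not the paper's definition:

```python
import math

def center(features):
    """Mean feature vector (the 'center') of a list of same-length vectors."""
    n = len(features)
    return [sum(x[d] for x in features) / n for d in range(len(features[0]))]

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cm_hard_center_triplet_loss(vis_feats, ir_feats, margin=0.3):
    """Hinge loss over identity centers across two modalities (a sketch).

    vis_feats / ir_feats: dict mapping identity id -> list of feature
    vectors from the visible / infrared images of one mini-batch.
    For each identity center, the positive distance is to the same
    identity's center in the other modality; the negative distance is to
    the hardest (closest) different-identity center there.
    """
    vis_c = {i: center(f) for i, f in vis_feats.items()}
    ir_c = {i: center(f) for i, f in ir_feats.items()}
    loss, terms = 0.0, 0
    # Symmetric: visible anchors against infrared centers, and vice versa.
    for anchors, others in ((vis_c, ir_c), (ir_c, vis_c)):
        for i, c in anchors.items():
            pos = dist(c, others[i])
            neg = min(dist(c, others[j]) for j in others if j != i)
            loss += max(0.0, margin + pos - neg)
            terms += 1
    return loss / terms
```

With well-separated identities the hinge is inactive and the loss is zero; when all centers collapse, the loss equals the margin, which is the behavior that lets it supervise cross-modality training.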
Availability of data and materials
Not applicable.
Funding
This work is supported by the National Natural Science Foundation of China (51607059), Heilongjiang Postdoctoral Financial Assistance, China (LBH-Z20188) and the Basic Science Research Project of Heilongjiang University, China (KJCX201904, 2020-KYYWF-1001).
Author information
Authors and Affiliations
Contributions
Not applicable.
Corresponding author
Ethics declarations
Conflict of interest
Not applicable.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Code availability
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhao, Y., Zhou, H., Cheng, H. et al. Cross-modal pedestrian re-recognition based on attention mechanism. Vis Comput 40, 2405–2418 (2024). https://doi.org/10.1007/s00371-023-02926-7