Cross-modal pedestrian re-recognition based on attention mechanism

Original Article · The Visual Computer

Abstract

Person re-identification, an essential research direction in intelligent security, has attracted growing attention from researchers. In practical scenarios, visible-light cameras depend heavily on lighting conditions and have limited detection capability in poor light, so many researchers have gradually shifted to cross-modality person re-identification. However, relevant studies are still scarce, and bridging the differences between images of the two modalities remains challenging. To address these problems, this paper adopts an attention-based approach that narrows the gap between the two modalities and guides the network in a more suitable direction, improving its recognition performance. Although attention mechanisms can improve training efficiency, they can also make model training unstable. This paper therefore proposes a cross-modality person re-identification method based on an attention mechanism: a new attention module is designed that lets the network focus on the more critical features of a person in less time, and a cross-modality hard center triplet loss is designed to better supervise model training. Extensive experiments with both techniques on two publicly available datasets achieve better performance than comparable current methods and verify the effectiveness and feasibility of the proposed approach.
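
To make the two components named above concrete, the sketches below show one plausible shape for each in PyTorch. They are illustrative only: the paper's actual module design and loss formulation are not reproduced in this excerpt, so all names, layer choices, and hyperparameters (e.g. the reduction ratio and the margin) are assumptions.

A lightweight channel-attention gate in the squeeze-and-excitation style matches the stated goal of focusing on more critical person features at little extra cost:

    import torch
    import torch.nn as nn

    class ChannelAttention(nn.Module):
        """SE-style channel gate (illustrative stand-in, not the paper's module)."""

        def __init__(self, channels, reduction=16):
            super().__init__()
            self.gate = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),                        # (N, C, 1, 1) global descriptor
                nn.Conv2d(channels, channels // reduction, 1),  # squeeze
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),  # excite
                nn.Sigmoid(),                                   # per-channel weights in [0, 1]
            )

        def forward(self, x):          # x: (N, C, H, W)
            return x * self.gate(x)    # reweight feature channels

A cross-modality hard center triplet loss can likewise be sketched by pulling each identity's visible-light center toward its infrared center while pushing away the hardest wrong-identity center; the single-direction mining and mean-based centers below are simplifying assumptions:

    import torch
    import torch.nn.functional as F

    def hard_center_triplet_loss(vis_feats, ir_feats, labels, margin=0.3):
        """vis_feats, ir_feats: (N, D) features from the two modalities;
        labels: (N,) identity labels shared by both tensors (illustrative)."""
        ids = labels.unique()
        # Per-identity feature centers within each modality.
        vis_c = torch.stack([vis_feats[labels == i].mean(0) for i in ids])
        ir_c = torch.stack([ir_feats[labels == i].mean(0) for i in ids])
        dist = torch.cdist(vis_c, ir_c)           # (P, P) cross-modality center distances
        pos = dist.diag()                         # same identity, other modality
        eye = torch.eye(len(ids), dtype=torch.bool, device=dist.device)
        neg = dist.masked_fill(eye, float('inf')).min(dim=1).values  # hardest negative center
        return F.relu(pos + margin - neg).mean()  # standard triplet hinge

In practice such a loss assumes a batch sampler that draws P identities with K visible and K infrared images each, so that every identity has a center in both modalities.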

Availability of data and materials

Not applicable.

Funding

This work is supported by the National Natural Science Foundation of China (51607059), Heilongjiang Postdoctoral Financial Assistance, China (LBH-Z20188), and the Basic Science Research Project of Heilongjiang University, China (KJCX201904, 2020-KYYWF-1001).

Author information

Contributions

Not applicable.

Corresponding author

Correspondence to Hai Cheng.

Ethics declarations

Conflict of interest

Not applicable.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Code availability

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhao, Y., Zhou, H., Cheng, H. et al. Cross-modal pedestrian re-recognition based on attention mechanism. Vis Comput 40, 2405–2418 (2024). https://doi.org/10.1007/s00371-023-02926-7
