Abstract
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality pedestrian retrieval problem. Due to the significant cross-modality discrepancy, it is difficult to learn discriminative features. Attention-based methods have been widely used to extract discriminative features for VI-ReID, but existing methods are confined to first-order structures that exploit only simple, coarse information and thus lack sufficient capacity to learn both modality-irrelevant and modality-relevant features. In this paper, we extract second-order information from mid-level features to complement first-order cues. Specifically, we design a flexible second-order module that models the correlations among common features and learns refined feature representations for pedestrian images. In addition, because a significant gap exists between the visible and infrared modalities, we propose a plug-and-play mixed intermediate modality module that generates intermediate modality representations to reduce the modality discrepancy between visible and infrared features. Extensive experiments on two challenging datasets, SYSU-MM01 and RegDB, demonstrate that our method achieves competitive performance compared to state-of-the-art methods.
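To make the two modules in the abstract concrete, the following is a minimal PyTorch sketch of how a second-order (covariance-based) channel attention and a mixed intermediate-modality blend might be implemented. The class names, the squeeze-and-excitation-style weighting, and the single learnable mixing scalar are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn


class SecondOrderAttention(nn.Module):
    # Sketch: reweights mid-level features using channel-wise second-order
    # (covariance) statistics. Structure is assumed for illustration.
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        b, c, h, w = x.shape
        feat = x.flatten(2)                           # (B, C, HW)
        feat = feat - feat.mean(dim=2, keepdim=True)  # center each channel
        cov = torch.bmm(feat, feat.transpose(1, 2)) / (h * w - 1)  # (B, C, C)
        stats = cov.mean(dim=2)                       # pool pairwise correlations -> (B, C)
        weights = self.fc(stats).view(b, c, 1, 1)
        return x * weights                            # second-order channel reweighting


class MixedIntermediateModality(nn.Module):
    # Sketch: blends visible and infrared features into an intermediate
    # representation; a learnable scalar controls the mixing ratio.
    def __init__(self):
        super().__init__()
        self.mix_logit = nn.Parameter(torch.zeros(1))  # sigmoid(0) = 0.5

    def forward(self, x_vis: torch.Tensor, x_ir: torch.Tensor) -> torch.Tensor:
        alpha = torch.sigmoid(self.mix_logit)          # keep ratio in (0, 1)
        return alpha * x_vis + (1 - alpha) * x_ir
```

In a full pipeline, both modules would typically be inserted as plug-and-play blocks between intermediate stages of a backbone (e.g., a ResNet-50) shared by the visible and infrared branches; the exact placement and loss functions follow the paper rather than this sketch.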
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grants U21A20514 and 62002302, by the FuXiaQuan National Independent Innovation Demonstration Zone Collaborative Innovation Platform Project under Grant 3502ZCQXT2022008, and by the China Fundamental Research Funds for the Central Universities under Grant 20720230038.
About this paper
Cite this paper
Tao, H., Zhang, Y., Lu, Y., Wang, H. (2024). An Effective Visible-Infrared Person Re-identification Network Based on Second-Order Attention and Mixed Intermediate Modality. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14433. Springer, Singapore. https://doi.org/10.1007/978-981-99-8546-3_10