Abstract
Cross-modality visible-infrared person re-identification (VI-ReID) aims to match visible and infrared pedestrian images captured by different cameras in diverse scenarios. However, most existing VI-ReID methods focus only on eliminating the modality discrepancy and ignore the intra-class discrepancy caused by different camera styles. In addition, some fusion-based VI-ReID methods try to improve the discriminative capability of pedestrian representations by fusing features from different convolutional layers or branches, but most of them fuse features through simple operations such as summation or concatenation and ignore the interaction between feature maps. To this end, we propose a camera style-invariant learning and channel interaction enhancement fusion network (CC-Net) for VI-ReID. In particular, we design a channel interaction enhancement fusion module: it first computes the channel-level similarity matrix of two feature maps and uses it to obtain two weighted feature maps that enhance the information the original maps jointly attend to; it then fuses the two weighted feature maps to mine their complementary information and produce more discriminative pedestrian features. Furthermore, to weaken the impact of camera style discrepancy in pedestrian images, we design a camera style-invariant feature-level adversarial learning strategy in which the feature extraction network and a camera style classifier are trained adversarially, so that the extracted pedestrian features become camera style-invariant. Extensive experiments on the two benchmark datasets, SYSU-MM01 and RegDB, demonstrate that CC-Net achieves performance competitive with recent state-of-the-art methods.
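To make the two components described in the abstract concrete, the sketches below show one plausible PyTorch-style realization under stated assumptions; the class names (ChannelInteractionFusion, GradReverse, CameraStyleClassifier), tensor shapes, and the fusion and loss choices are illustrative assumptions and do not reproduce the authors' implementation.

```python
# A minimal sketch of the channel interaction enhancement fusion idea:
# the channel-level similarity matrix of two feature maps re-weights each
# map by the channels of the other before fusion. All names, shapes, and
# the residual-sum fusion are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelInteractionFusion(nn.Module):
    def forward(self, x, y):
        # x, y: (B, C, H, W) feature maps from two layers or branches.
        b, c, h, w = x.shape
        x_flat = x.view(b, c, -1)                      # (B, C, H*W)
        y_flat = y.view(b, c, -1)                      # (B, C, H*W)

        # Channel-level similarity matrix between the two maps: (B, C, C).
        sim = torch.bmm(x_flat, y_flat.transpose(1, 2)) / (h * w)

        # Normalized similarities emphasize channels the two maps jointly
        # attend to (their "common concern" information).
        w_xy = F.softmax(sim, dim=-1)                  # weights over y's channels
        w_yx = F.softmax(sim.transpose(1, 2), dim=-1)  # weights over x's channels

        # Weighted feature maps: each map enhanced by the other's channels.
        x_weighted = torch.bmm(w_xy, y_flat).view(b, c, h, w)
        y_weighted = torch.bmm(w_yx, x_flat).view(b, c, h, w)

        # Residual-style fusion keeps the original responses and mines the
        # complementary information of the two weighted maps.
        return (x + x_weighted) + (y + y_weighted)


fusion = ChannelInteractionFusion()
out = fusion(torch.randn(2, 256, 24, 12), torch.randn(2, 256, 24, 12))
print(out.shape)  # torch.Size([2, 256, 24, 12])
```

The camera style-invariant feature-level adversarial learning can likewise be sketched with a gradient reversal layer, a common way to set a feature extractor against a domain (here, camera style) classifier; the feature dimension, camera count, and loss combination below are assumptions.

```python
# Hypothetical sketch of feature-level adversarial learning against a camera
# style classifier via gradient reversal (an assumed mechanism; the paper's
# exact adversarial scheme may differ).
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated (scaled) gradient backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


class CameraStyleClassifier(nn.Module):
    """Predicts which camera a pooled pedestrian feature came from."""

    def __init__(self, feat_dim, num_cameras, lambd=1.0):
        super().__init__()
        self.lambd = lambd
        self.classifier = nn.Linear(feat_dim, num_cameras)

    def forward(self, feat):
        # Reversing the gradient makes minimizing the camera-style loss
        # push the feature extractor toward camera style-invariant features.
        reversed_feat = GradReverse.apply(feat, self.lambd)
        return self.classifier(reversed_feat)


# Usage (hypothetical): add the camera-style cross-entropy term to the
# usual ReID losses so extractor and classifier are trained adversarially.
# cam_logits = CameraStyleClassifier(2048, num_cameras=6)(pooled_features)
# loss = reid_loss + nn.functional.cross_entropy(cam_logits, camera_labels)
```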
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Code availability
The code that supports the findings of this study is available from the corresponding author upon reasonable request.
Funding
This work was supported in part by the Science and Technology Development Plan Project of Henan Province, China (No. 222102110135) and the Natural Science Foundation of Henan Province, China (No. 202300410093).
Author information
Contributions
HD: methodology, writing (original draft preparation), writing (review and editing), supervision, and funding acquisition. XH: conceptualization and writing (review and editing). YY: software, visualization, and data curation. LH: investigation, software, and validation. JG: investigation, visualization, and validation. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Du, H., Hao, X., Ye, Y. et al. A camera style-invariant learning and channel interaction enhancement fusion network for visible-infrared person re-identification. Machine Vision and Applications 34, 117 (2023). https://doi.org/10.1007/s00138-023-01473-4