research-article

Attentive multi-granularity perception network for person search

Published: 18 October 2024

Abstract

Person search is an extremely challenging task that seeks to identify individuals through joint person detection and person re-identification from uncropped real scene images. Previous studies primarily focus on learning rich features to enhance identification. However, arbitrary feature enhancement strategies may introduce unwanted background noise. Moreover, different scenarios usually exhibit varying pedestrian appearances or even intricate occlusions, leading to inconsistent/incomplete pedestrian features in different images. In this paper, we introduce a novel Attentive Multi-granularity Perception (AMP) module seamlessly integrated into our AMPN network. This module specifically addresses appearance variations and occlusions within a person's Region of Interest (RoI). The AMP module harnesses discriminative relationship features from various local regions, significantly enhancing identification accuracy. It comprises two principal components: the Pedestrian Perception Enhancement (PPE) block and the Background Interference Suppressor (BIS). The PPE block introduces a Spatial-wise Feature Mixer and a Channel-wise Feature Mixer, which effectively capture and refine discriminative relation features. Simultaneously, the BIS operates in parallel with the PPE block, enriching the discriminative relation features and enhancing the distinctiveness between the foreground and background. Our AMP module is plug-and-play and can integrate with other person search models. Extensive experiments validate our model's merits, achieving state-of-the-art performance on CUHK-SYSU and a 4.8% mAP gain over SeqNet on PRW at a desirable speed. Our code is accessible at https://github.com/zqx951102/AMPN.
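As a rough illustration of what spatial-wise and channel-wise feature mixing can look like, here is a minimal sketch in pure Python. It is not the authors' implementation: it assumes an MLP-Mixer-style design in which an RoI feature map is flattened to a (positions, channels) matrix, one small MLP mixes across positions, and another mixes across channels, each with a residual connection. The names `spatial_mixer`, `channel_mixer`, and `random_matrix`, the ReLU activation, the missing normalization layers, and all sizes are assumptions made for the example. The BIS branch is not modeled.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def relu(m):
    """Element-wise ReLU."""
    return [[max(0.0, x) for x in row] for row in m]


def random_matrix(rows, cols, rng, scale=0.1):
    """Hypothetical helper: small random weights (or features) for the demo."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def spatial_mixer(x, w1, w2):
    """Mix information across spatial positions, separately for each channel.

    x:  (positions, channels) feature matrix
    w1: (hidden, positions), w2: (positions, hidden)
    """
    mixed = matmul(w2, relu(matmul(w1, x)))  # back to (positions, channels)
    return add(x, mixed)  # residual connection (an assumption)


def channel_mixer(x, w1, w2):
    """Mix information across channels, separately for each position.

    x:  (positions, channels) feature matrix
    w1: (channels, hidden), w2: (hidden, channels)
    """
    mixed = matmul(relu(matmul(x, w1)), w2)  # back to (positions, channels)
    return add(x, mixed)  # residual connection (an assumption)


# Toy RoI: 6 spatial positions, 4 channels; sizes are illustrative only.
rng = random.Random(0)
tokens, channels, hidden = 6, 4, 8
roi = random_matrix(tokens, channels, rng, scale=1.0)
sw1 = random_matrix(hidden, tokens, rng)
sw2 = random_matrix(tokens, hidden, rng)
cw1 = random_matrix(channels, hidden, rng)
cw2 = random_matrix(hidden, channels, rng)

out = channel_mixer(spatial_mixer(roi, sw1, sw2), cw1, cw2)
print(len(out), len(out[0]))  # prints "6 4": the feature shape is preserved
```

Because both mixers preserve the input shape, a block of this kind can in principle be placed between an RoI feature extractor and an identification head without changing either, which is consistent with the plug-and-play property claimed in the abstract.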

    Published In

    Information Sciences: an International Journal, Volume 681, Issue C
    Oct 2024
    1022 pages

    Publisher

    Elsevier Science Inc.

    United States

    Author Tags

    1. Person search
    2. Person re-identification
    3. Multi-granularity
    4. Feature mixer

    Qualifiers

    • Research-article
