rapid-communication

Adaptive deep metric embeddings for person re-identification under occlusions

Published: 07 May 2019

Abstract

Person re-identification (ReID) under occlusions is a challenging problem in video surveillance. Most existing person ReID methods rely on local features to deal with occlusions. However, these methods usually extract features from the local regions of an image independently, without considering the relationships among different local regions. In this paper, we propose a novel person ReID method that uses Long Short-Term Memory (LSTM) to extract a discriminative feature representation of the pedestrian image and thereby handle occlusions. In particular, multi-directional spatially encoded local features are developed to learn the spatial dependencies between local regions by taking advantage of LSTM. Furthermore, we propose a novel loss, termed the adaptive nearest neighbor loss, which is based on classification uncertainty and reduces intra-class variations while enlarging inter-class differences within the adaptive neighborhood of each sample. The proposed loss enables the deep neural network to adaptively learn discriminative metric embeddings, which significantly improve its ability to generalize to unseen person identities. Extensive comparative evaluations on challenging person ReID datasets demonstrate that the proposed method performs significantly better than several state-of-the-art methods.
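The abstract does not give the formulation of the adaptive nearest neighbor loss, so the following is only an illustrative sketch of the general idea it describes: a neighborhood whose size depends on classification uncertainty, inside which same-identity samples are pulled together and different-identity samples are pushed apart. Everything specific here is an assumption made for illustration and is not taken from the paper: the entropy-based uncertainty measure, the linear mapping from uncertainty to neighborhood size, the squared-distance and margin terms, and the names `adaptive_k`, `neighborhood_loss`, `k_min`, `k_max` and `margin`.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def adaptive_k(probs, k_min, k_max):
    """Assumed rule: more uncertain predictions get a larger neighborhood."""
    max_entropy = math.log(len(probs))
    uncertainty = entropy(probs) / max_entropy if max_entropy > 0 else 0.0
    return k_min + round(uncertainty * (k_max - k_min))


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def neighborhood_loss(embeddings, labels, probs, k_min=1, k_max=3, margin=1.0):
    """Mean over anchors of pull/push terms within each anchor's k nearest neighbors."""
    total = 0.0
    for i, anchor in enumerate(embeddings):
        k = adaptive_k(probs[i], k_min, k_max)
        # Distances to every other sample, nearest first, truncated to k entries.
        neighbors = sorted(
            (euclidean(anchor, other), labels[j])
            for j, other in enumerate(embeddings)
            if j != i
        )[:k]
        for dist, label in neighbors:
            if label == labels[i]:
                # Same identity: penalize any remaining distance (intra-class term).
                total += dist ** 2
            else:
                # Different identity: penalize only if closer than the margin.
                total += max(0.0, margin - dist) ** 2
    return total / len(embeddings)
```

In this sketch, a confident prediction (low entropy) restricts the loss to the single nearest neighbor, while a maximally uncertain one widens the neighborhood to `k_max` samples. A real training setup would compute the same quantities on mini-batches with a differentiable tensor library.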

Published In

Neurocomputing, Volume 340, Issue C
May 2019
310 pages

Publisher

Elsevier Science Publishers B. V.
Netherlands

Author Tags

1. Person re-identification
2. Occlusion
3. Long short-term memory
4. Adaptive nearest neighbor loss
