End-to-End Detection and Re-identification Integrated Net for Person Search

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11362)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

This paper proposes a pedestrian detection and re-identification (re-id) integrated net (I-Net) in an end-to-end learning framework. The I-Net targets real-world video surveillance scenarios, where the target person must be searched for in whole-scene videos and annotations of pedestrian bounding boxes are unavailable. Compared with the successful OIM method [31] for joint detection and re-id, we make three distinct contributions. First, we implement a Siamese architecture instead of a single stream for an end-to-end training strategy. Second, a novel on-line pairing loss (OLP) with a feature dictionary restricts the positive pairs. Third, a hard example priority softmax loss (HEP) with little computational cost is proposed to deal with online hard example mining. We show our results on the CUHK-SYSU and PRW datasets. Our method narrows the gap between detection and re-identification, and achieves superior performance.

This work is supported by National Natural Science Fund of China (Grant 61771079) and Chongqing Science and Technology Project (cstc2017zdcy-zdzxX0002).

References

1. Ahmed, E., Jones, M., Marks, T.K.: An improved deep learning architecture for person re-identification. In: Computer Vision and Pattern Recognition, pp. 3908–3916 (2015)
2. Cao, C., Wang, Y., Kato, J., Zhang, G., Mase, K.: Solving occlusion problem in pedestrian detection by constructing discriminative part layers. In: Applications of Computer Vision, pp. 91–99 (2017)
3. Chen, S.Z., Guo, C.C., Lai, J.H.: Deep ranking for person re-identification via joint representation learning. IEEE Trans. Image Process. 25(5), 2353–2367 (2016)
4. Chen, W., Chen, X., Zhang, J., Huang, K.: Beyond triplet loss: a deep quadruplet network for person re-identification (2017)
5. Chen, W., Chen, X., Zhang, J., Huang, K.: A multi-task deep network for person re-identification (2017)
6. Cheng, D., Gong, Y., Zhou, S., Wang, J., Zheng, N.: Person re-identification by multi-channel parts-based CNN with improved triplet loss function. In: Computer Vision and Pattern Recognition, pp. 1335–1344 (2016)
7. Ding, S., Lin, L., Wang, G., Chao, H.: Deep feature learning with relative distance comparison for person re-identification. Pattern Recognit. 48(10), 2993–3003 (2015)
8. Dollar, P., Belongie, S., Perona, P.: Fast feature pyramids for object detection. IEEE Trans. Pattern Anal. Mach. Intell. 36(8), 1532 (2014)
9. Dollar, P., Tu, Z., Perona, P., Belongie, S.: Integral channel features. In: British Machine Vision Conference, BMVC 2009, London, UK, 7–10 September 2009, Proceedings (2009)
10. Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010)
11. Girshick, R.: Fast R-CNN. Computer Science (2015)
12. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Computer Vision and Pattern Recognition, pp. 770–778 (2016)
13. Jia, Y., et al.: Caffe: convolutional architecture for fast feature embedding. In: ACM International Conference on Multimedia, pp. 675–678 (2014)
14. Li, W., Wang, X.: Locally aligned feature transforms across views. In: Computer Vision and Pattern Recognition, pp. 3594–3601 (2013)
15. Li, W., Zhao, R., Wang, X.: Human reidentification with transferred metric learning. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds.) ACCV 2012. LNCS, vol. 7724, pp. 31–44. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-37331-2_3
16. Li, X., Zheng, W.S., Wang, X., Xiang, T.: Multi-scale learning for low-resolution person re-identification. In: IEEE International Conference on Computer Vision, pp. 3765–3773 (2015)
17. Liao, S., Hu, Y., Zhu, X., Li, S.Z.: Person re-identification by local maximal occurrence representation and metric learning. In: Computer Vision and Pattern Recognition, pp. 2197–2206 (2015)
18. Liu, H., et al.: Neural person search machines (2017)
19. Liu, J., et al.: Multi-scale triplet CNN for person re-identification. In: ACM on Multimedia Conference, pp. 192–196 (2016)
20. Nam, W., Dollar, P., Han, J.H.: Local decorrelation for improved pedestrian detection. In: NIPS, pp. 1–9 (2014)
21. Ouyang, W., Zeng, X., Wang, X.: Modeling mutual visibility relationship in pedestrian detection. In: Computer Vision and Pattern Recognition, pp. 3222–3229 (2013)
22. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2017)
23. Roth, P.M., Wohlhart, P., Hirzer, M., Kostinger, M., Bischof, H.: Large scale metric learning from equivalence constraints. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2288–2295 (2012)
24. Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition and clustering, pp. 815–823 (2015)
25. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. Computer Science (2014)
26. Song, B., Kamal, A.T., Soto, C., Ding, C., Farrell, J.A., Roychowdhury, A.K.: Tracking and activity recognition through consensus in distributed camera networks. IEEE Trans. Image Process. 19(10), 2564–2579 (2010)
27. Song, H., Wang, W., Wang, J., Wang, R.: Collaborative deep networks for pedestrian detection. In: IEEE Third International Conference on Multimedia Big Data, pp. 146–153 (2017)
28. Tian, Y., Luo, P., Wang, X., Tang, X.: Pedestrian detection aided by deep learning semantic tasks, pp. 5079–5087 (2014)
29. Wang, X.: Intelligent multi-camera video surveillance: a review. Pattern Recognit. Lett. 34, 3–19 (2013)
30. Xiao, T., Li, S., Wang, B., Lin, L., Wang, X.: End-to-end deep learning for person search (2016)
31. Xiao, T., Li, S., Wang, B., Lin, L., Wang, X.: Joint detection and identification feature learning for person search. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017
32. Yang, B., Yan, J., Lei, Z., Li, S.Z.: Convolutional channel features. In: IEEE International Conference on Computer Vision, pp. 82–90 (2015)
33. Zhang, L., Lin, L., Liang, X., He, K.: Is faster R-CNN doing well for pedestrian detection? In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 443–457. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6_28
34. Zhang, S., Benenson, R., Schiele, B.: Filtered channel features for pedestrian detection, pp. 1751–1760 (2015)
35. Zhao, R., Ouyang, W., Wang, X.: Unsupervised salience learning for person re-identification. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 3586–3593 (2013)
36. Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., Tian, Q.: Scalable person re-identification: a benchmark. In: IEEE International Conference on Computer Vision, pp. 1116–1124 (2015)
37. Zheng, L., Zhang, H., Sun, S., Chandraker, M., Yang, Y., Tian, Q.: Person re-identification in the wild, pp. 3346–3355 (2016)
38. Zheng, Z., Zheng, L., Yang, Y.: A discriminatively learned CNN embedding for person re-identification (2016)

Author information

Authors and Affiliations

School of Microelectronics and Communication Engineering, Chongqing University, No. 174 Shazheng Street, Shapingba District, Chongqing, 400044, China
Zhenwei He & Lei Zhang

Corresponding author

Correspondence to Lei Zhang.

Editor information

Editors and Affiliations

IIIT Hyderabad, Hyderabad, India
C. V. Jawahar
ANU, Canberra, ACT, Australia
Hongdong Li
Simon Fraser University, Burnaby, BC, Canada
Greg Mori
ETH Zurich, Zurich, Zürich, Switzerland
Konrad Schindler

Cite this paper

He, Z., Zhang, L. (2019). End-to-End Detection and Re-identification Integrated Net for Person Search. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol 11362. Springer, Cham. https://doi.org/10.1007/978-3-030-20890-5_23

DOI: https://doi.org/10.1007/978-3-030-20890-5_23
Published: 02 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20889-9
Online ISBN: 978-3-030-20890-5
eBook Packages: Computer Science, Computer Science (R0)
