research-article

Looking beyond appearances: Synthetic training data for deep CNNs in re-identification

Published: 01 February 2018

Highlights

A new synthetic dataset for re-identification is presented.
Fine-tuning of a re-identification DNN with synthetic data is proposed.
Experiments with synthetic data indicate improved re-identification performance on other datasets.

Abstract

Re-identification is generally carried out by encoding the appearance of a subject in terms of outfit, which restricts it to scenarios where people do not change their attire. In this paper we overcome this restriction by proposing a framework based on a deep convolutional neural network, SOMAnet, that additionally models other discriminative aspects, namely structural attributes of the human figure (e.g. height, obesity, gender). Our method is distinctive in several respects. First, SOMAnet is based on the Inception architecture, departing from the usual siamese framework. This spares expensive data preparation (pairing images across cameras) and makes it possible to understand what the network has learned. Second, and most notably, the training data consists of a synthetic 100K-instance dataset, SOMAset, created with photorealistic human body generation software. SOMAset will be released under an open source license to enable further developments in re-identification. Synthetic data is a cost-effective way of acquiring semi-realistic imagery (full realism is usually not required in re-identification, since surveillance cameras capture low-resolution silhouettes) while providing complete control of the samples in terms of ground truth. It is therefore relatively easy to customize the data with respect to the surveillance scenario at hand, e.g. ethnicity. SOMAnet, trained on SOMAset and fine-tuned on recent re-identification benchmarks, matches subjects even when they wear different apparel.
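The abstract does not describe how matching is performed at test time. As a rough illustration only, the sketch below shows one common way a non-siamese network can be used for re-identification: each image is mapped to a feature vector, and gallery images are ranked by cosine similarity to the query. The feature vectors, identity labels, and the function names `cosine_similarity`, `rank_gallery` and `rank_k_accuracy` are hypothetical and are not taken from the paper; the features stand in for whatever a trained network would output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted from most to least similar to the query."""
    sims = [cosine_similarity(query_feat, g) for g in gallery_feats]
    return sorted(range(len(gallery_feats)), key=lambda i: -sims[i])


def rank_k_accuracy(query_feats, query_ids, gallery_feats, gallery_ids, k=1):
    """Fraction of queries whose true identity appears among the top-k matches."""
    hits = 0
    for feat, true_id in zip(query_feats, query_ids):
        top_k = rank_gallery(feat, gallery_feats)[:k]
        if any(gallery_ids[i] == true_id for i in top_k):
            hits += 1
    return hits / len(query_ids)


# Toy data (made up): three gallery identities and two queries.
gallery_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
gallery_ids = [10, 20, 30]
query_feats = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.9]]
query_ids = [10, 30]

print(rank_k_accuracy(query_feats, query_ids, gallery_feats, gallery_ids, k=1))  # prints 1.0
```

Because ranking only needs one feature vector per image, no cross-camera image pairs have to be prepared in advance, which is the kind of data preparation the abstract says a siamese setup requires.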


Information

Published In

Computer Vision and Image Understanding, Volume 167, Issue C (Feb 2018), 153 pages

Publisher

Elsevier Science Inc., United States

Author Tags

1. Re-identification
2. Deep learning
3. Training set
4. Automated training dataset generation
5. Re-identification photorealistic dataset
