Low-Resolution Face Recognition in the Wild via Selective Knowledge Distillation

Published: 01 April 2019

Abstract

Typically, the deployment of face recognition models in the wild needs to identify low-resolution faces with extremely low computational cost. A feasible solution to this problem is to compress a complex face model, achieving higher speed and a lower memory footprint at the cost of a minimal performance drop. Inspired by this, this paper proposes a learning approach to recognize low-resolution faces via selective knowledge distillation. In this approach, a two-stream convolutional neural network (CNN) is first initialized to recognize high-resolution faces and resolution-degraded faces with a teacher stream and a student stream, respectively. The teacher stream is represented by a complex CNN for high-accuracy recognition, and the student stream is represented by a much simpler CNN for low-complexity recognition. To avoid a significant performance drop at the student stream, we then selectively distill the most informative facial features from the teacher stream by solving a sparse graph optimization problem; the selected features are then used to regularize the fine-tuning process of the student stream. In this way, the student stream is trained to simultaneously handle two tasks with limited computational resources: approximating the most informative facial cues via feature regression, and recovering the missing facial cues via low-resolution face classification. Experimental results show that the student stream performs impressively in recognizing low-resolution faces while requiring only 0.15 MB of memory, running at 418 faces per second on a CPU and 9433 faces per second on a GPU.
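The training objective described above combines two terms: a standard classification loss for the low-resolution face task, and a regression penalty pulling the student's features toward the teacher's selected (most informative) features. The sketch below illustrates that combined loss in plain NumPy; the function name `selective_distillation_loss`, the weight `lam`, and the index array `selected_idx` are illustrative assumptions, not names from the paper, and the sparse-graph feature selection itself is abstracted into the given indices.

```python
import numpy as np

def selective_distillation_loss(student_logits, labels, student_feat,
                                teacher_feat, selected_idx, lam=1.0):
    """Classification loss plus feature regression on selected teacher features."""
    # Softmax cross-entropy: the low-resolution face classification task.
    z = student_logits - student_logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(labels)), labels].mean()
    # Feature regression: penalize the student only on the dimensions
    # selected (distilled) from the teacher stream.
    diff = student_feat[:, selected_idx] - teacher_feat[:, selected_idx]
    reg = (diff ** 2).sum(axis=1).mean()
    return ce + lam * reg

# Toy batch: 2 faces, 3 identities, 4-dim features; distill dims 0 and 2.
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 3))
labels = np.array([0, 2])
t_feat = rng.normal(size=(2, 4))
s_feat = t_feat.copy()          # student features match the teacher exactly
idx = np.array([0, 2])

matched = selective_distillation_loss(logits, labels, s_feat, t_feat, idx)
shifted = selective_distillation_loss(logits, labels, s_feat + 1.0, t_feat, idx)
```

With matched features the regression term vanishes and only the classification loss remains; shifting the student's features raises the loss by exactly the squared offset on the selected dimensions, which is the regularizing pressure the paper uses during fine-tuning.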



Information & Contributors

Information

Published In

IEEE Transactions on Image Processing  Volume 28, Issue 4
April 2019
528 pages

Publisher

IEEE Press

Publication History

Published: 01 April 2019

Qualifiers

  • Research-article


Cited By

  • Gradient-Based Instance-Specific Visual Explanations for Object Specification and Object Discrimination, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 9, pp. 5967–5985, 2024. doi: 10.1109/TPAMI.2024.3380604
  • Learning From Human Educational Wisdom: A Student-Centered Knowledge Distillation Method, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 6, pp. 4188–4205, 2024. doi: 10.1109/TPAMI.2024.3354928
  • Learning Shape-Biased Representations for Infrared Small Target Detection, IEEE Transactions on Multimedia, vol. 26, pp. 4681–4692, 2024. doi: 10.1109/TMM.2023.3325743
  • Semi-Supervised Single-Image Dehazing Network via Disentangled Meta-Knowledge, IEEE Transactions on Multimedia, vol. 26, pp. 2634–2647, 2024. doi: 10.1109/TMM.2023.3301273
  • Region-Aware Portrait Retouching With Sparse Interactive Guidance, IEEE Transactions on Multimedia, vol. 26, pp. 127–140, 2024. doi: 10.1109/TMM.2023.3262185
  • Learning Contrast-Enhanced Shape-Biased Representations for Infrared Small Target Detection, IEEE Transactions on Image Processing, vol. 33, pp. 3047–3058, 2024. doi: 10.1109/TIP.2024.3391011
  • Texture-Guided Transfer Learning for Low-Quality Face Recognition, IEEE Transactions on Image Processing, vol. 33, pp. 95–107, 2024. doi: 10.1109/TIP.2023.3335830
  • Benchmarking Micro-Action Recognition: Dataset, Methods, and Applications, IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 7, pp. 6238–6252, 2024. doi: 10.1109/TCSVT.2024.3358415
  • Equity in Unsupervised Domain Adaptation by Nuclear Norm Maximization, IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 7, pp. 5533–5545, 2024. doi: 10.1109/TCSVT.2023.3346444
  • Dual Circle Contrastive Learning-Based Blind Image Super-Resolution, IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 3, pp. 1757–1771, 2024. doi: 10.1109/TCSVT.2023.3297673
