
IHEM Loss: Intra-Class Hard Example Mining Loss for Robust Face Recognition

Published: 01 November 2022

Abstract

Recently, angular margin-based methods have become the mainstream approach for unconstrained face recognition, with remarkable success. However, robust face recognition remains a challenge, as faces are subject to variations in pose, age, expression, occlusion, and illumination, especially in unconstrained scenarios. Since training datasets are typically collected in unconstrained scenarios, a significant number of hard examples inevitably arise during training. In this paper, we design a hard example selection function to effectively identify hard examples during training under the supervision of angular margin-based losses. Furthermore, a novel Intra-class Hard Example Mining (IHEM) loss function is proposed, which penalizes the cosine distance between hard examples and their class centers to enhance the discriminative power of face representations. To ensure high performance for face recognition, we combine the supervision of an angular margin-based loss and the IHEM loss for model training. Specifically, during training, the angular margin-based loss guarantees the discriminative power of the features for face recognition, while the IHEM loss further encourages the intra-class compactness of hard examples. Extensive results demonstrate the superiority of our approach.
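The core idea of the abstract can be sketched in a few lines: select "hard" examples and penalize their cosine distance to their class center. The sketch below is a minimal illustration, not the paper's implementation — in particular, the threshold-based selection rule and the function names (`cosine`, `ihem_loss`, `threshold`) are assumptions for illustration; the paper's actual selection function is driven by the angular margin-based loss, and the IHEM term is combined with that loss during training.

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def ihem_loss(features, labels, centers, threshold=0.5):
    """Toy IHEM-style penalty (hypothetical selection rule).

    An example is treated as "hard" when the cosine similarity between
    its feature and its class center falls below `threshold`; the loss
    is the mean cosine distance (1 - cos) over those hard examples.
    """
    hard = [(f, labels[i]) for i, f in enumerate(features)
            if cosine(f, centers[labels[i]]) < threshold]
    if not hard:
        return 0.0
    return sum(1.0 - cosine(f, centers[c]) for f, c in hard) / len(hard)
```

In training, this term would be added to the angular margin-based loss (e.g. `total = margin_loss + lam * ihem_loss(...)` for some weight `lam`), so the margin loss maintains inter-class separability while the IHEM term pulls hard examples toward their class centers.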


Cited By

  • (2024) Hand Gesture Authentication by Discovering Fine-Grained Spatiotemporal Identity Characteristics, IEEE Transactions on Circuits and Systems for Video Technology, 34(1): 461–474, Jan. 2024. doi: 10.1109/TCSVT.2023.3286460
  • (2023) Self-Paced Hard Task-Example Mining for Few-Shot Classification, IEEE Transactions on Circuits and Systems for Video Technology, 33(10): 5631–5644, Oct. 2023. doi: 10.1109/TCSVT.2023.3263593
  • (2023) PoiseNet: Dealing With Data Imbalance in DensePose, IEEE Transactions on Circuits and Systems for Video Technology, 33(10): 5664–5678, Oct. 2023. doi: 10.1109/TCSVT.2023.3253190


Published In

IEEE Transactions on Circuits and Systems for Video Technology, Volume 32, Issue 11, Nov. 2022, 808 pages

Publisher

IEEE Press

Qualifiers

  • Research-article
