DOI: 10.1145/3664647.3681231
research-article

RobustFace: Adaptive Mining of Noise and Hard Samples for Robust Face Recognitions

Published: 28 October 2024

Abstract

While margin-based deep face recognition models, such as ArcFace and AdaFace, have achieved remarkable success in recent years, their performance may degrade when training sets are corrupted with noise. Such noise is often inevitable at very large scale, where constructing sufficiently clean face datasets remains difficult. In this paper, we propose a robust deep face recognition model, RobustFace, which combines the advantages of margin-based learning with the strengths of mining-based approaches to mitigate the impact of noise during training. Specifically, we introduce a noise-adaptive mining strategy that dynamically adjusts the balance of emphasis between hard and noisy samples by monitoring the model's recognition performance at the batch level to provide optimization-oriented feedback, enabling direct training on noisy datasets without pre-training. Extensive experiments show that RobustFace achieves performance competitive with existing state-of-the-art (SoTA) models when trained on clean datasets. When trained on real-world and synthetic noisy datasets, RobustFace significantly outperforms existing models, especially when the synthetic noisy datasets are corrupted with both closed-set and open-set noise. Under these circumstances, existing baseline models suffer an average performance drop of around 40%, whereas RobustFace still delivers accuracy of more than 90%.
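The abstract does not give the loss or the mining rule, so the following is only a minimal sketch of the two ingredients it names. The first function is the well-known ArcFace-style target logit, s * cos(theta + m), without the boundary handling real implementations add for theta + m > pi. The second part is a purely hypothetical batch-level rule, not the RobustFace formulation: the names `batch_accuracy`, `mining_weights`, and the threshold `noise_gap` are assumptions introduced here for illustration.

```python
import math


def arcface_logit(cos_theta, margin=0.5, scale=64.0):
    """ArcFace-style target-class logit: scale * cos(theta + margin)."""
    # Clamp to the valid domain of acos to guard against rounding error.
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    return scale * math.cos(theta + margin)


def batch_accuracy(target_cos, rival_cos):
    """Fraction of samples whose labelled-class cosine beats the best other class."""
    correct = sum(t > r for t, r in zip(target_cos, rival_cos))
    return correct / len(target_cos)


def mining_weights(target_cos, rival_cos, noise_gap=0.5):
    """Hypothetical weighting rule (an assumption, not taken from the paper).

    Misclassified samples that lose by a small cosine gap are treated as
    hard and emphasised; those that lose by a large gap are treated as
    likely label noise and suppressed. Both adjustments scale with the
    batch accuracy, so an unreliable early model changes little.
    """
    acc = batch_accuracy(target_cos, rival_cos)
    weights = []
    for t, r in zip(target_cos, rival_cos):
        if t > r:
            weights.append(1.0)        # correctly classified: neutral weight
        elif r - t < noise_gap:
            weights.append(1.0 + acc)  # narrow miss: treat as a hard sample
        else:
            weights.append(1.0 - acc)  # wide miss: treat as probable noise
    return acc, weights


# Four samples: two correct, one narrow miss, one wide miss.
acc, weights = mining_weights([0.9, 0.8, 0.3, -0.2], [0.1, 0.2, 0.5, 0.7])
print(acc, weights)  # 0.5 [1.0, 1.0, 1.5, 0.5]
```

In this toy rule the batch accuracy plays the role of the feedback signal: at 0% accuracy every sample keeps weight 1.0, and the separation between hard and probable-noise samples widens only as the model becomes more reliable.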

    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
    ISBN: 9798400706868
    DOI: 10.1145/3664647

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. face recognition
    2. hard sample mining
    3. noise label
    4. noise-resistant

    Qualifiers

    • Research-article

    Conference

    MM '24
    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne VIC, Australia

    Acceptance Rates

    MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
