DOI: 10.1145/3581783.3612329
research-article

Cross-Modal and Multi-Attribute Face Recognition: A Benchmark

Published: 27 October 2023

Abstract

Face recognition has advanced significantly with the development of deep learning and has begun to be deployed in some unrestricted scenarios. Many smartphones, for example, carry infrared sensors that let them capture clear images even in low-light conditions. Face authentication under complex environmental conditions can thus be accomplished by matching near-infrared (NIR) and visible-light (VIS) face images across modalities. However, existing NIR-VIS datasets lack sufficient variation in face attributes and do not adequately cover real-world scenarios. To address these issues, we first present a 300-person NIR-VIS cross-modality face dataset covering a variety of attributes. We then propose a NIR-VIS cross-modal face recognition model based on modal information removal: we extract modal information by constraining the similarity distribution of the modalities, and then use an orthogonal loss to remove that modal information from the identity features. The method achieves excellent results on our dataset and on the CASIA NIR-VIS 2.0 dataset.
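The abstract does not state the form of the orthogonal loss. As a minimal sketch under an assumed, commonly used formulation, the code below penalizes the squared cosine similarity between each identity feature and the modality feature extracted from the same image, so the loss reaches zero when the two are orthogonal. The function names and the toy vectors are illustrative and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Guard against zero vectors; treat them as orthogonal to everything.
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def orthogonal_loss(identity_feats, modality_feats):
    """Mean squared cosine similarity over a batch (assumed formulation).

    identity_feats[i] and modality_feats[i] come from the same image.
    The loss is 0 when every pair is orthogonal and 1 when every pair
    is parallel.
    """
    if not identity_feats:
        raise ValueError("batch must not be empty")
    if len(identity_feats) != len(modality_feats):
        raise ValueError("batches must have the same length")
    pairs = zip(identity_feats, modality_feats)
    return sum(cosine(f, m) ** 2 for f, m in pairs) / len(identity_feats)


# Toy batch: the first pair is orthogonal, the second is parallel.
identity = [[1.0, 0.0], [0.0, 2.0]]
modality = [[0.0, 3.0], [0.0, 1.0]]
loss = orthogonal_loss(identity, modality)
print(loss)
```

In training, a term like this would typically be added to an identity classification loss with a weighting factor; that weight is a hyperparameter the abstract does not specify.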

Cited By

  • (2024) Multi-attribute Semantic Adversarial Attack Based on Cross-layer Interpolation for Face Recognition. 2024 International Joint Conference on Neural Networks (IJCNN), pp. 1-9. DOI: 10.1109/IJCNN60899.2024.10650828. Online publication date: 30-Jun-2024

    Published In

    MM '23: Proceedings of the 31st ACM International Conference on Multimedia
    October 2023
    9913 pages
    ISBN: 9798400701085
    DOI: 10.1145/3581783
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. cross-modal dataset
    2. face recognition
    3. feature extraction

    Qualifiers

    • Research-article

    Conference

    MM '23
    MM '23: The 31st ACM International Conference on Multimedia
    October 29 - November 3, 2023
    Ottawa ON, Canada

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 170
    • Downloads (Last 6 weeks): 2
    Reflects downloads up to 03 Feb 2025
