DOI: 10.1145/3638884.3638888
research-article

A Fine-grained Biometric Image Recognition Method Based on Transformer

Published: 23 April 2024

Abstract

Fine-grained biometric image recognition aims to classify subclasses by exploiting detailed features. It remains a challenging problem because the differences between subclasses are small. In recent years, the Transformer model, originally developed for natural language processing, has been applied to computer vision. The Transformer splits an image into patches and computes attention weights between them to obtain a better feature representation. In this paper, we propose a Transformer-based model for fine-grained biometric image recognition. Specifically, during patch encoding, the model generates a weight for every patch and saves the corresponding attention scores. To verify the effectiveness of our method, we conducted experiments on the CUB-200-2011 and Stanford Dogs datasets.
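The patch-and-attention idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it shows only generic single-head scaled dot-product attention over image patches, in plain Python. The function names (`split_into_patches`, `attention_scores`), the toy 8x8 grayscale image, the patch size, the embedding width, and the random projection matrices are all assumptions made for illustration.

```python
import math
import random


def split_into_patches(image, patch_size):
    """Split an H x W image (list of rows) into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patches.append([image[top + dy][left + dx]
                            for dy in range(patch_size)
                            for dx in range(patch_size)])
    return patches


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_scores(tokens, w_q, w_k):
    """Return the attention weight of every token with respect to every other token."""
    queries = matmul(tokens, w_q)
    keys = matmul(tokens, w_k)
    scale = math.sqrt(len(queries[0]))
    logits = [[sum(q * k for q, k in zip(q_row, k_row)) / scale for k_row in keys]
              for q_row in queries]
    return [softmax(row) for row in logits]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned projection weights."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
image = [[rng.random() for _ in range(8)] for _ in range(8)]  # toy 8x8 grayscale image
patches = split_into_patches(image, patch_size=4)             # 4 patches, 16 pixels each
embed_dim = 6                                                 # assumed embedding width
tokens = matmul(patches, random_matrix(16, embed_dim, rng))   # linear patch embedding
scores = attention_scores(tokens,
                          random_matrix(embed_dim, embed_dim, rng),
                          random_matrix(embed_dim, embed_dim, rng))
# scores[i][j] is the weight patch i assigns to patch j; each row sums to 1.
```

Keeping `scores` after the forward pass mirrors the abstract's remark that attention scores are saved. How the proposed model then uses those saved scores is not specified in the abstract, so the sketch stops there.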

    Information

    Published In

    ICCIP '23: Proceedings of the 2023 9th International Conference on Communication and Information Processing
    December 2023
    648 pages
    ISBN:9798400708909
    DOI:10.1145/3638884
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Fine-grained biometric image recognition
    2. Transformer

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICCIP 2023

    Acceptance Rates

    Overall Acceptance Rate 61 of 301 submissions, 20%

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 17
    • Downloads (last 12 months): 17
    • Downloads (last 6 weeks): 6
    Reflects downloads up to 18 Nov 2024
