A Fine-grained Biometric Image Recognition Method Based on Transformer

Authors:

Man Zhang

ICCIP '23: Proceedings of the 2023 9th International Conference on Communication and Information Processing

Pages 19 - 23

https://doi.org/10.1145/3638884.3638888

Published: 23 April 2024

Abstract

Fine-grained biometric image recognition aims to classify subclasses by processing detailed features, and it remains a critical open problem in computing because the differences between subclasses are small. In recent years, the Transformer model, originally developed for natural language processing, has been applied to computer vision. The Transformer splits an image into patches and calculates the weights between different parts to obtain a better feature representation. In this paper, we propose a Transformer-based model for fine-grained biometric image recognition. Specifically, during patch encoding, our model generates a corresponding weight for every patch and saves the corresponding attention scores. To verify the effectiveness of our method, we conducted experiments on the CUB-200-2011 and Stanford Dogs datasets.
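The patch-and-weight idea in the abstract can be illustrated with a minimal sketch in plain Python. This is not the paper's model: the abstract gives no architectural details, so the sketch assumes a toy grayscale image, uses the raw patch pixels as tokens (no learned query/key projections), and ranks patches by the total attention they receive. The helper names `split_into_patches`, `attention_weights`, and `top_patches` are hypothetical and chosen only for illustration.

```python
import math
import random


def split_into_patches(image, patch):
    """Split an H x W grayscale image (list of rows) into flattened patch vectors."""
    height, width = len(image), len(image[0])
    assert height % patch == 0 and width % patch == 0, "image must tile evenly"
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(tokens):
    """Scaled dot-product attention weights.

    Assumption: queries and keys are the tokens themselves; a real
    Transformer would apply learned linear projections first.
    """
    scale = math.sqrt(len(tokens[0]))
    weights = []
    for query in tokens:
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in tokens]
        weights.append(softmax(scores))
    return weights


def top_patches(weights, k):
    """Return indices of the k patches that receive the most attention (column sums)."""
    received = [sum(row[j] for row in weights) for j in range(len(weights))]
    order = sorted(range(len(received)), key=lambda j: received[j], reverse=True)
    return order[:k]


random.seed(0)
image = [[random.random() for _ in range(8)] for _ in range(8)]  # toy 8x8 image
patches = split_into_patches(image, 4)   # four 4x4 patches, 16 values each
weights = attention_weights(patches)     # 4x4 matrix, each row sums to 1
selected = top_patches(weights, 2)       # saved scores used to rank patches
```

Each row of `weights` describes how strongly one patch attends to every other patch. Keeping this matrix around, rather than discarding it after the forward pass, is what allows patches to be scored or selected afterwards.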

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Biometrics

Information

Published In

ICCIP '23: Proceedings of the 2023 9th International Conference on Communication and Information Processing

December 2023

648 pages

ISBN:9798400708909

DOI:10.1145/3638884

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICCIP 2023: 2023 the 9th International Conference on Communication and Information Processing

December 14 - 16, 2023

Lingshui, China

Acceptance Rates

Overall Acceptance Rate 61 of 301 submissions, 20%
