Applications of Image Processing and Pattern Recognition in Biometrics

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 10 June 2025 | Viewed by 11742

Special Issue Editors


Dr. Elias N. Zois
Guest Editor
Department of Electrical and Electronics Engineering, University of West Attica, 12243 Athens, Greece
Interests: digital signal processing; image processing; computer vision; pattern recognition; handwriting biometry

Prof. Dr. Dimitrios Kalivas
Guest Editor
Department of Electrical and Electronics Engineering, University of West Attica, 12243 Athens, Greece
Interests: image processing; video processing; pattern recognition; image and video communications

Special Issue Information

Dear Colleagues,

We would like to invite you to submit your research work to our Special Issue, “Applications of Image Processing and Pattern Recognition in Biometrics”.

This Special Issue focuses on the interdisciplinary processing, extraction of information, and recognition of all types of biometric content contained in digital images, with important applications in high-level understanding. Biometrics is the task of identifying individuals by means of their physiological or behavioral traits. A number of related scientific disciplines bear on this topic; acquisition, image analysis, machine learning, and computer vision are notable examples. This Special Issue aims to bring together and present recent advances in theory, comprehensive studies, and surveys, as well as contemporary and new research ideas for processing and understanding the biometric content of images, whether as a whole or as part of a combined infrastructure. Submissions should report progress and advance the state of the art in the image processing, computer vision, and pattern recognition/machine learning fields, with a primary focus on physiological and behavioral biometric attributes.

Broad topics/keywords and areas of interest include, but are not limited to:

  • Individual biometric modalities, both established/traditional and contemporary, as well as any type of their fusion.
  • Hardware technologies for biometric image processing and pattern recognition.
  • Core groundwork theory and applications on image processing, pattern recognition, machine learning, and computer vision techniques relevant to biometrics processing and analysis.
  • Datasets, evaluation, and benchmarking.
  • Image processing, pattern recognition, machine learning, and computer vision for biometrics in healthcare, banking, IoT, etc.
  • Image processing, pattern recognition, machine learning, and computer vision for biometrics in forensic applications.

Dr. Elias N. Zois
Prof. Dr. Dimitrios Kalivas
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, you can proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (6 papers)


Research

31 pages, 4101 KiB  
Article
Fingerprint Classification Based on Multilayer Extreme Learning Machines
by Axel Quinteros and David Zabala-Blanco
Appl. Sci. 2025, 15(5), 2793; https://doi.org/10.3390/app15052793 - 5 Mar 2025
Viewed by 198
Abstract
Fingerprint recognition is one of the most effective and widely adopted methods for person identification. However, the computational time required for querying large databases is excessive. To address this, preprocessing steps such as classification are necessary to speed up the response time to a query. Fingerprints are typically categorized into five classes, though this classification is unbalanced. While advanced classification algorithms, including support vector machines (SVMs), multilayer perceptrons (MLPs), and convolutional neural networks (CNNs), have demonstrated near-perfect accuracy (approaching 100%), their high training times limit their widespread applicability across institutions. In this study, we introduce, for the first time, the use of a multilayer extreme learning machine (M-ELM) for fingerprint classification, aiming to improve training efficiency. A comparative analysis is conducted with CNNs and weighted extreme learning machines (W-ELMs), as these represent the most influential methodologies in the literature. The tests utilize a database generated by the SFINGE software, which simulates realistic fingerprint distributions, with datasets comprising hundreds of thousands of samples. To optimize and simplify the M-ELM, widely recognized descriptors in the field—Capelli02, Liu10, and Hong08—are used as input features. This effectively reduces dimensionality while preserving the representativeness of the fingerprint information. A brute-force heuristic optimization approach is applied to determine the hyperparameters that maximize classification accuracy across different M-ELM configurations while avoiding excessive training times. A comparison is made with the aforementioned approaches in terms of accuracy, penetration rate, and computational cost. The results demonstrate that a two-hidden-layer ELM achieves superior classification of both majority and minority fingerprint classes with remarkable computational efficiency.
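As background for readers unfamiliar with the ELM family: the defining trick is that the hidden-layer weights are drawn at random and never trained, so only the output weights need to be learned, and they have a closed-form least-squares solution, which is what makes training so fast. The NumPy sketch below illustrates this for a plain single-hidden-layer ELM; it is not the authors' M-ELM (which stacks ELM autoencoders), and the tanh activation, layer size, and toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, T, n_hidden):
    # Hidden layer is random and stays fixed; only beta is learned.
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights
    b = rng.normal(size=n_hidden)                 # random biases
    H = np.tanh(X @ W + b)                        # hidden activations
    beta = np.linalg.pinv(H) @ T                  # closed-form least-squares solve
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy run: 5 fingerprint classes, one-hot targets, random stand-in "descriptor" features.
X = rng.random((200, 64))
T = np.eye(5)[rng.integers(0, 5, 200)]
W, b, beta = elm_train(X, T, n_hidden=100)
print((elm_predict(X, W, b, beta).argmax(1) == T.argmax(1)).mean())
```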
Figures:
Figure 1: General architecture of an original ELM.
Figure 2: Representative structure of a multilayer ELM.
Figure 3: General architecture of an ELM-AE. The colors indicate the type of neuron, following the same scheme as the original ELM.
Figure 4: Samples concerning fingerprint image quality: (a) default; (b) HQNoPert; (c) VQAndPert.
Figure 5: Accuracy in the training and validation phases versus the number of hidden neurons of the original ELM, with Capelli02 as the descriptor.
Figure 6: Accuracy versus the number of hidden neurons of the original ELM in the training and validation phases, with Hong08 as the descriptor.
Figure 7: Accuracy versus the number of hidden neurons of the original ELM in the training and validation phases, with Liu10 as the descriptor.
Figure 8: Accuracy versus the number of neurons of the two-hidden-layer ELM with the Capelli02 descriptor and the (a) default, (b) HQNoPert, and (c) VQAndPert databases.
Figure 9: Accuracy versus the number of neurons of the two-hidden-layer ELM with the Hong08 descriptor and the (a) default, (b) HQNoPert, and (c) VQAndPert databases.
Figure 10: Accuracy versus the number of neurons of the two-hidden-layer ELM with the Liu10 descriptor and the (a) default, (b) HQNoPert, and (c) VQAndPert databases.
Figure 11: Accuracy versus the number of neurons of the three-hidden-layer ELM with the Capelli02 descriptor and the (a) default, (b) HQNoPert, and (c) VQAndPert databases.
Figure 12: Accuracy versus the number of neurons of the three-hidden-layer ELM with the Hong08 descriptor and the (a) default, (b) HQNoPert, and (c) VQAndPert databases.
Figure 13: Accuracy versus the number of neurons of the three-hidden-layer ELM with the Liu10 descriptor and the (a) default, (b) HQNoPert, and (c) VQAndPert databases.
Figure 14: Confusion matrices for the ELM-M2 and ELM-M3 models.
20 pages, 3791 KiB  
Article
Impact of Texture Feature Count on the Accuracy of Osteoporotic Change Detection in Computed Tomography Images of Trabecular Bone Tissue
by Róża Dzierżak
Appl. Sci. 2025, 15(3), 1528; https://doi.org/10.3390/app15031528 - 2 Feb 2025
Viewed by 751
Abstract
The aim of this study is to compare classification accuracy depending on the number of texture features used. The study used 400 computed tomography (CT) images of trabecular spinal tissue from 100 patients belonging to two groups (50 control patients and 50 patients diagnosed with osteoporosis). The texture feature descriptors were based on a gray-level histogram, a gradient matrix, a run-length (RL) matrix, a co-occurrence (event) matrix, an autoregressive model, and the wavelet transform. From the 290 obtained texture features, features with fixed values were eliminated, and the remainder were ordered according to a feature-importance ranking. Classification performance was assessed using 267, 200, 150, 100, 50, 20, and 10 texture features to build classifiers. The classifiers applied in this study included Naive Bayes, Multilayer Perceptron, Hoeffding Tree, K-Nearest Neighbors, and Random Forest. The following indicators were used to assess classifier quality: accuracy, sensitivity, specificity, precision, negative predictive value, Matthews correlation coefficient, and F1 score. The highest performance was achieved by the K-Nearest Neighbors (K = 1) and Multilayer Perceptron classifiers. KNN demonstrated the best results with 50 features, attaining the highest F1 score of 96.79% and an accuracy (ACC) of 96.75%. MLP achieved its optimal performance with 100 features, reaching an accuracy and F1 score of 96.50%. This demonstrates that building a classifier using a larger number of features, without a selection process, allows high classification effectiveness to be achieved and holds significant diagnostic value.
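The evaluation protocol described above (rank the features, then retrain classifiers on progressively smaller subsets) can be sketched in a few lines of scikit-learn. This is a hedged illustration, not the study's code: mutual information stands in for the unspecified importance ranking, the data are synthetic (so the printed numbers are meaningless), and the cross-validation setup is assumed.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-ins: 400 samples x 290 texture features, binary labels
# (0 = control, 1 = osteoporosis).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 290))
y = rng.integers(0, 2, 400)

# Rank features by an importance criterion (mutual information here), best first.
ranking = np.argsort(mutual_info_classif(X, y, random_state=0))[::-1]

# Re-evaluate the classifier with shrinking feature subsets, as in the study.
for k in (267, 200, 150, 100, 50, 20, 10):
    knn = KNeighborsClassifier(n_neighbors=1)          # K = 1, as in the paper
    acc = cross_val_score(knn, X[:, ranking[:k]], y, cv=5).mean()
    print(f"{k:3d} features -> mean CV accuracy {acc:.3f}")
```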
Figures:
Figure 1: Illustration of the selection of the cancellous bone tissue area for the study (red box).
Figure 2: Imaging samples of tissue from healthy patients and tissue with osteoporotic changes, in their original size.
Figure 3: Visualization of the distribution of feature values: (a) the first feature in the ranking; (b) the last feature in the ranking (red: values for osteoporotic tissue; blue: values for healthy tissue).
Figure 4: Changes in the ACC, TPR, and TNR values depending on the number of features for Naive Bayes classification.
Figure 5: Changes in the ACC, TPR, and TNR values depending on the number of features for Multilayer Perceptron classification.
Figure 6: Changes in the ACC, TPR, and TNR values depending on the number of features for Hoeffding Tree classification.
Figure 7: Changes in the ACC, TPR, and TNR values depending on the number of features for K-NN classification.
Figure 8: Changes in the ACC, TPR, and TNR values depending on the number of features for Random Forest classification.
Figure 9: Confusion matrices for (a) K-Nearest Neighbors and (b) Multilayer Perceptron classifiers.
14 pages, 470 KiB  
Article
Speaker Anonymization: Disentangling Speaker Features from Pre-Trained Speech Embeddings for Voice Conversion
by Marco Matassoni, Seraphina Fong and Alessio Brutti
Appl. Sci. 2024, 14(9), 3876; https://doi.org/10.3390/app14093876 - 30 Apr 2024
Cited by 2 | Viewed by 1866
Abstract
Speech is a crucial source of personal information, and the risk of attackers using such information increases day by day. Speaker privacy protection is crucial, and various approaches have been proposed to hide the speaker’s identity. One approach is voice anonymization, which aims to safeguard speaker identity while maintaining speech content through techniques such as voice conversion or spectral feature alteration. The significance of voice anonymization has grown due to the necessity to protect personal information in applications such as voice assistants, authentication, and customer support. Building upon the S3PRL-VC toolkit and on pre-trained speech and speaker representation models, this paper introduces a feature disentanglement approach to improve the de-identification performance of the state-of-the-art anonymization approaches based on voice conversion. The proposed approach achieves state-of-the-art speaker de-identification and causes minimal impact on the intelligibility of the signal after conversion.
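A rough reading of the disentanglement idea in code: the downstream task keeps its Connectionist Temporal Classification (CTC) loss, while an added cosine-similarity term forces the representations of an utterance and its pitch/formant-shifted copy to agree, squeezing speaker-specific information out of the content representation. The PyTorch sketch below is schematic; the function name, tensor shapes, and weighting factor lam are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def disentangling_loss(h_orig, h_shift, log_probs, targets, in_lens, tgt_lens, lam=1.0):
    """CTC loss for the ASR downstream task plus a cosine term tying together
    the representations of an utterance and its pitch/formant-shifted copy."""
    ctc = F.ctc_loss(log_probs, targets, in_lens, tgt_lens)  # log_probs: (T, N, C)
    # 1 - cosine similarity, averaged over frames: pulls the pair together,
    # progressively removing speaker cues from the content representation.
    cos = 1.0 - F.cosine_similarity(h_orig, h_shift, dim=-1).mean()
    return ctc + lam * cos

# Toy shapes: batch of 2, 50 frames, 256-dim SSL features, 30-symbol vocabulary.
h1, h2 = torch.randn(2, 50, 256), torch.randn(2, 50, 256)
logp = torch.randn(50, 2, 30).log_softmax(-1)
tgt = torch.randint(1, 30, (2, 10))
loss = disentangling_loss(h1, h2, logp, tgt, torch.full((2,), 50), torch.full((2,), 10))
```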
Figures:
Figure 1: Privacy-preserving strategies in cloud-based speech applications (from [17]).
Figure 2: Voice conversion scheme: the input signal is encoded with a self-supervised learning (SSL) pre-trained model (i.e., WavLM); the conversion model generates a converted representation using this content representation along with a different speaker embedding; finally, the vocoder synthesizes the converted waveform.
Figure 3: Scheme for the disentanglement mechanism: the original signal and a modified version (with pitch and formant shifts) are used as pairs for training the downstream task (e.g., automatic speech recognition). Alongside the original Connectionist Temporal Classification (CTC) loss, an additional loss based on the cosine similarity of the signal pair constrains the resulting representations, forcing the two representations to be similar and thereby progressively reducing the information associated with the speaker characteristics.
19 pages, 3467 KiB  
Article
A Medical Image Encryption Scheme for Secure Fingerprint-Based Authenticated Transmission
by Francesco Castro, Donato Impedovo and Giuseppe Pirlo
Appl. Sci. 2023, 13(10), 6099; https://doi.org/10.3390/app13106099 - 16 May 2023
Cited by 20 | Viewed by 3408
Abstract
Secure transmission of medical images and medical data is essential in healthcare systems, both in telemedicine and AI approaches. The compromise of images and medical data could affect patient privacy and the accuracy of diagnosis. Digital watermarking embeds medical images into a non-significant image before transmission to ensure visual security. However, it is vulnerable to white-box attacks, because the embedded medical image can be extracted by an attacker who knows the system’s operation, and it does not ensure the authenticity of image transmission. A visually secure image encryption scheme for secure fingerprint-based authenticated transmission is proposed to solve the above issues. The proposed scheme embeds the encrypted medical image, the encrypted physician’s fingerprint, and the patient's electronic health record (EHR) into a non-significant image to ensure integrity, authenticity, and confidentiality during medical image and medical data transmission. A chaotic encryption algorithm based on a permutation key is used to encrypt the medical image and fingerprint feature vector. A hybrid asymmetric cryptography scheme based on Elliptic Curve Cryptography (ECC) and AES is implemented to protect the permutation key. Simulations and comparative analysis show that the proposed scheme achieves higher visual security of the encrypted image and higher medical image reconstruction quality than other secure image encryption approaches.
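To make the chaotic permutation idea concrete, here is a minimal sketch of one common construction (an assumption for illustration, not taken from the paper): a logistic map generates a pseudo-random sequence, and the sort order of that sequence serves as the pixel permutation key, with the map's initial value x0 and parameter r acting as the secret. The paper's full scheme, including ECC/AES protection of the key and DWT-based embedding into a reference image, is omitted here.

```python
import numpy as np

def logistic_permutation(n, x0=0.7, r=3.99):
    """Derive a length-n permutation from a logistic-map chaotic sequence."""
    x = np.empty(n)
    v = x0
    for i in range(n):
        v = r * v * (1.0 - v)   # logistic map iteration
        x[i] = v
    return np.argsort(x)        # sort order of the sequence = permutation key

def permute_image(img, key):
    flat = img.ravel()
    return flat[key].reshape(img.shape)

def unpermute_image(img, key):
    flat = np.empty(img.size, dtype=img.dtype)
    flat[key] = img.ravel()     # invert the permutation
    return flat.reshape(img.shape)

# Round-trip check on a tiny toy "image".
img = np.arange(16, dtype=np.uint8).reshape(4, 4)
key = logistic_permutation(img.size)
assert np.array_equal(unpermute_image(permute_image(img, key), key), img)
```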
Figures:
Figure 1: Visually secure image encryption scheme.
Figure 2: Visually secure image decryption scheme.
Figure 3: 2D-DWT process on the reference image.
Figure 4: Watermarking process.
Figure 5: (a) Hybrid asymmetric encryption scheme; (b) hybrid asymmetric decryption scheme.
Figure 6: (a) Medical images to be transmitted and protected; (b) non-significant reference images; (c) visually meaningful encrypted images; (d) reconstructed medical images.
Figure 7: Histogram analysis of reference images and their corresponding visually meaningful encrypted images: (a) reference images; (b) histograms of reference images; (c) visually meaningful encrypted images; (d) comparison between the histograms of the encrypted images and those of the reference images.
Figure 8: Reconstructed medical images with a random shuffle of the mapped vector s_m.
24 pages, 4172 KiB  
Article
Smartwatch In-Air Signature Time Sequence Three-Dimensional Static Restoration Classification Based on Multiple Convolutional Neural Networks
by Yuheng Guo and Hiroyuki Sato
Appl. Sci. 2023, 13(6), 3958; https://doi.org/10.3390/app13063958 - 20 Mar 2023
Cited by 1 | Viewed by 1811
Abstract
In-air signatures are promising applications that have been investigated extensively in the past decades; an in-air signature involves gathering datasets through portable devices, such as smartwatches. During the signing process, individuals wear smartwatches on their wrists and sign their names in the air. The dataset used in this study collected in-air signatures from 22 participants, resulting in a total of 440 smartwatch in-air signature signals. The dynamic time warping (DTW) algorithm was applied to verify the usability of the dataset. This paper analyzes and compares the performance of multiple convolutional neural networks (CNNs) and the transformer using median-sized smartwatch in-air signatures. For the four CNN models, the in-air digital signature data were first transformed into visible three-dimensional static signatures. For the transformer, the nine-dimensional in-air signature signals were concatenated and downsampled to the desired length and then fed into the transformer for time-sequence signal multi-classification. The performance of each model on the smartwatch in-air signature dataset was thoroughly tested with respect to 10 optimizers and different learning rates. The best testing performance score in our experiment was 99.8514%, obtained with ResNet using the Adagrad optimizer under a 1 × 10^−4 learning rate.
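The DTW check mentioned in the abstract computes an elastic alignment cost between two variable-length signals by dynamic programming: each cell of the cost matrix adds the local distance between a pair of samples to the cheapest of its three predecessor cells. A textbook NumPy sketch follows; the nine-dimensional toy signals and Euclidean local cost are assumptions rather than the paper's exact configuration.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])   # local Euclidean cost
            # Extend the cheapest of: match, insertion, deletion.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Toy usage: two 9-dimensional in-air signature signals of different lengths.
sig1 = np.random.rand(120, 9)
sig2 = np.random.rand(140, 9)
print(dtw_distance(sig1, sig2))
```

Genuine pairs of the same signer should yield systematically lower DTW costs than genuine-forgery pairs, which is the usability check the figures below illustrate.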
Figures:
Figure 1: Experimental system architecture for smartwatch in-air recognition. The data collection phase, implemented in Swift, was conducted by Li [4] in his previous research. The convolutional-layer experiments are based on AlexNet [16], LeNet [15], VGG [17], and ResNet [18]; the transformer experiments are based on the original transformer paper [11].
Figure 2: Smartwatch accelerometer data of 10 genuine signature signals from the same participant.
Figure 3: Dynamic time warping alignment of two signature signals. The blue and orange lines represent the two signatures; the yellow lines show the alignment.
Figure 4: Dynamic time warping cost matrix of two signature signals.
Figure 5: Genuine versus forged in-air signature DTW distance comparisons for the smartwatch. PXR denotes the DTW distance among the genuine signatures of participant X, and PXF the DTW distance of the forgery signatures of participant X. The yellow dots show the specific DTW alignment costs between pairs of signatures.
Figure 6: Gray-scale in-air signature after a 32 × 32 transformation and the three-channel RGB in-air signature after a 224 × 224 transformation.
Figure 7: LeNet-5 structure.
Figure 8: AlexNet structure.
Figure 9: VGG-16 structure (D configuration).
Figure 10: Three-dimensional static restoration of a smartwatch in-air signature.
Figure 11: Loss curves of LeNet for 10 different optimizers, covering both training and testing loss for each optimizer. The two zoom-in areas are (x1, x2) = (0, 1000), (y1, y2) = (3, 4) and (x1, x2) = (0, 300), (y1, y2) = (0, 2).
Figure 12: Loss curves of AlexNet for 10 different optimizers, covering both training and testing loss for each optimizer. The zoom-in area is (x1, x2) = (0, 100), (y1, y2) = (0, 4).
Figure 13: Loss curves of VGG for 10 different optimizers, covering both training and testing loss for each optimizer. The zoom-in area is (x1, x2) = (0, 100), (y1, y2) = (0, 4).
Figure 14: Loss curves of ResNet for 10 different optimizers, covering both training and testing loss for each optimizer. The zoom-in area is (x1, x2) = (0, 100), (y1, y2) = (0, 4).
Figure 15: Nine-dimensional concatenated length of a smartwatch in-air signature.
Figure 16: Transformer over-fitting phenomena on the smartwatch dataset over 50 repetitions.
Figure 17: Learning curve comparisons for the LeNet, AlexNet, VGG, and ResNet training and testing processes.
Figure 18: Loss curve comparisons for the LeNet, AlexNet, VGG, and ResNet training and testing processes.
13 pages, 3716 KiB  
Article
Finger Vein and Inner Knuckle Print Recognition Based on Multilevel Feature Fusion Network
by Li Jiang, Xianghuan Liu, Haixia Wang and Dongdong Zhao
Appl. Sci. 2022, 12(21), 11182; https://doi.org/10.3390/app122111182 - 4 Nov 2022
Cited by 8 | Viewed by 2108
Abstract
Multimodal biometric recognition involves two critical issues: feature representation and multimodal fusion. Traditional feature representation requires complex image preprocessing and different feature-extraction methods for different modalities. Moreover, the multimodal fusion methods used in previous work simply splice the features of different modalities, resulting in an unsatisfactory feature representation. To address these two problems, we propose a Dual-Branch-Net based recognition method with finger vein (FV) and inner knuckle print (IKP). The method combines a convolutional neural network (CNN), transfer learning, and a triplet loss function to complete the feature representation, thereby simplifying and unifying the feature-extraction process of the two modalities. Dual-Branch-Net also achieves deep multilevel fusion of the two modalities’ features. We assess our method on a public FV and IKP homologous multimodal dataset named PolyU-DB. Experimental results show that the proposed method performs best, achieving an equal error rate (EER) of 0.422%.
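The triplet loss mentioned in the abstract trains embeddings so that two samples of the same finger end up closer together than samples of different fingers by at least a margin. A minimal PyTorch sketch, with a hypothetical embedding size and margin value:

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Embeddings of the same finger should be closer than those of different fingers
    # by at least `margin`; violations contribute to the loss.
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

# Toy usage with hypothetical 128-D embeddings.
a = torch.randn(8, 128)
p = a + 0.05 * torch.randn_like(a)   # same identity, slight variation
n = torch.randn(8, 128)              # different identity
print(triplet_loss(a, p, n).item())
```

In the paper's setting, the anchor, positive, and negative would be network embeddings of FV/IKP image triplets rather than random vectors.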
Figures:
Figure 1: The architecture of the biometric recognition system.
Figure 2: FV and IKP fusion network framework.
Figure 3: Dual-Branch-Net network structure.
Figure 4: Training a network using the triplet loss function.
Figure 5: Sample images of PolyU-DB: (a) FV images and (b) their corresponding IKP images.
Figure 6: ROC curves of the ablation experiment results.
Figure 7: ROC curves of different recognition schemes with and without pretraining.