Search Results (82)

Search Parameters:
Keywords = CRNN

13 pages, 2404 KiB  
Article
Automated Cough Analysis with Convolutional Recurrent Neural Network
by Yiping Wang, Mustafaa Wahab, Tianqi Hong, Kyle Molinari, Gail M. Gauvreau, Ruth P. Cusack, Zhen Gao, Imran Satia and Qiyin Fang
Bioengineering 2024, 11(11), 1105; https://doi.org/10.3390/bioengineering11111105 - 1 Nov 2024
Viewed by 627
Abstract
Chronic cough is associated with several respiratory diseases and is a significant burden on physical, social, and psychological health. Non-invasive, real-time, continuous, and quantitative monitoring tools are highly desired to assess cough severity, the effectiveness of treatment, and monitor disease progression in clinical practice and research. There are currently limited tools to quantitatively measure spontaneous coughs in daily living settings in clinical trials and in clinical practice. In this study, we developed a machine learning model for the detection and classification of cough sounds. Mel spectrograms are utilized as a key feature representation to capture the temporal and spectral characteristics of coughs. We applied this approach to automate cough analysis using 300 h of audio recordings from cough challenge clinical studies conducted in a clinical lab setting. A number of machine learning algorithms were studied and compared, including decision tree, support vector machine, k-nearest neighbors, logistic regression, random forest, and neural network. We identified that for this dataset, the CRNN approach is the most effective method, reaching 98% accuracy in identifying individual coughs from the audio data. These findings provide insights into the strengths and limitations of various algorithms, highlighting the potential of CRNNs in analyzing complex cough patterns. This research demonstrates the potential of neural network models in fully automated cough monitoring. The approach requires validation in detecting spontaneous coughs in patients with refractory chronic cough in a real-life setting. Full article
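As a rough illustration of the pipeline this abstract describes (Mel spectrograms of short audio windows fed to a convolutional recurrent classifier), the following PyTorch sketch shows one plausible shape of such a model. The window length, Mel parameters, and layer sizes are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn
import torchaudio

# Assumed parameters: 0.5 s windows at 16 kHz, 64 Mel bands (not taken from the paper).
mel = torchaudio.transforms.MelSpectrogram(sample_rate=16000, n_fft=400, hop_length=160, n_mels=64)
to_db = torchaudio.transforms.AmplitudeToDB(top_db=80)

class CoughCRNN(nn.Module):
    """CNN front-end over the Mel spectrogram, GRU over time, binary cough/no-cough head."""
    def __init__(self, n_mels=64, hidden=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.rnn = nn.GRU(64 * (n_mels // 4), hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 2)

    def forward(self, wav):                       # wav: (batch, samples)
        x = to_db(mel(wav)).unsqueeze(1)          # (batch, 1, n_mels, frames)
        x = self.conv(x)                          # (batch, 64, n_mels/4, frames/4)
        x = x.permute(0, 3, 1, 2).flatten(2)      # (batch, frames/4, 64 * n_mels/4)
        out, _ = self.rnn(x)
        return self.head(out[:, -1])              # logits: cough vs. non-cough

logits = CoughCRNN()(torch.randn(8, 8000))        # eight 0.5 s windows at the assumed rate
```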
Show Figures

Figure 1. Examples of cough data in time (a)- and frequency (b–d)-domain formats: (a) 2 min amplitude graph, with coughing events marked by red lines; (b) Mel spectrogram of the same audio segment; (c) 0.5 s audio segment containing a cough occurring between 1217.0 and 1217.3 s; (d) 0.5 s audio segment that does not contain a cough. The color bar shows the volume of the recording in decibels, from yellow (0 dB) to black (80 dB).
Figure 2. The structure of the CRNN model.
Figure 3. Cross-validation results of models trained and tested on different datasets. The metrics used were accuracy, sensitivity, and specificity. Blue represents the full dataset, orange represents set A, green represents set B, and red represents. (a) Accuracy across train-test combinations. (b) Sensitivity across train-test combinations. (c) Specificity across train-test combinations.
Figure 4. Training and testing performance metrics of the CRNN model across 50 epochs. In the left panel, the blue line shows the training loss and the orange line the test loss. In the right panel, the blue line shows the training accuracy, orange the test accuracy, green the training sensitivity, red the test sensitivity, purple the training specificity, and brown the test specificity.
Figure 5. ROC curves of all four machine learning models and the CRNN model. All models were trained and tested on the filtered CCH dataset. The ROC curve of the CRNN is from the best mixed-performance model.
20 pages, 3003 KiB  
Article
Equipment Sounds’ Event Localization and Detection Using Synthetic Multi-Channel Audio Signal to Support Collision Hazard Prevention
by Kehinde Elelu, Tuyen Le and Chau Le
Buildings 2024, 14(11), 3347; https://doi.org/10.3390/buildings14113347 - 23 Oct 2024
Viewed by 442
Abstract
Construction workplaces often face unforeseen collision hazards due to a decline in auditory situational awareness among on-foot workers, leading to severe injuries and fatalities. Previous studies that used auditory signals to prevent collision hazards focused on employing a classical beamforming approach to determine equipment sounds’ Direction of Arrival (DOA). No existing frameworks implement a neural network-based approach for both equipment sound classification and localization. This paper presents an innovative framework for sound classification and localization using multichannel sound datasets artificially synthesized in a virtual three-dimensional space. The simulation synthesized 10,000 multi-channel datasets using just fourteen single-sound-source audiotapes. Training uses a two-stage convolutional recurrent neural network (CRNN), in which the first stage learns multi-label sound event classes and the second stage estimates their DOA. The proposed framework achieves a low average DOA error of 30 degrees and a high F-score of 0.98, demonstrating accurate localization and classification of equipment near workers’ positions on the site. Full article
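A two-stage SELD model of the kind summarized above can be sketched as a shared convolutional recurrent encoder over multichannel spectrogram features, with one head for multi-label event classes and a second head regressing DOA. The channel count, feature sizes, and (x, y) azimuth parameterization below are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class SELDNet(nn.Module):
    """Sketch of a two-stage SELD model: shared CRNN encoder, event-class head, DOA head."""
    def __init__(self, n_channels=4, n_mels=64, n_classes=14, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(n_channels, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d((4, 1)),                 # pool frequency only, keep time resolution
            nn.Conv2d(64, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d((4, 1)),
        )
        self.rnn = nn.GRU(64 * (n_mels // 16), hidden, batch_first=True, bidirectional=True)
        self.sed_head = nn.Linear(2 * hidden, n_classes)       # stage 1: multi-label events
        self.doa_head = nn.Linear(2 * hidden, n_classes * 2)   # stage 2: (x, y) azimuth per class

    def forward(self, spec):                      # spec: (batch, channels, n_mels, frames)
        z = self.encoder(spec)                    # (batch, 64, n_mels/16, frames)
        z = z.permute(0, 3, 1, 2).flatten(2)      # (batch, frames, features)
        z, _ = self.rnn(z)
        return torch.sigmoid(self.sed_head(z)), torch.tanh(self.doa_head(z))

sed, doa = SELDNet()(torch.randn(2, 4, 64, 100))  # per-frame event activity and DOA estimates
```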
(This article belongs to the Special Issue Big Data Technologies in Construction Management)
Show Figures

Figure 1. Multichannel Audio-based Collision Hazard Detection Pipeline.
Figure 2. Spectrogram for (Left) Crane—Mobile Equipment, (Right) Saw—Stationary Equipment.
Figure 3. Sample simulation setup.
Figure 4. Sample scenario of equipment moving toward workers on a construction site. One piece of equipment is mobile, moving towards the right (left ball), and another is stationary (right ball). (A) Initial position of both pieces of equipment sound; (B) the mobile equipment (left ball) approaches the workers, while the stationary equipment (right ball) remains in place; (C) the mobile equipment is halfway toward the workers, with a potential collision hazard emerging; (D) the mobile equipment reaches its closest point to the workers.
Figure 5. Two-Stage Sound Event Detection and Localization Network.
Figure 6. SELD Score for Scenarios with both Stationary and Mobile Equipment.
Figure 7. SELD Score for Scenarios with Two Concurrent Mobile Equipment.
Figure 8. DOA Error Distribution across Different Equipment Types.
13 pages, 3573 KiB  
Review
Cornulin as a Key Diagnostic and Prognostic Biomarker in Cancers of the Squamous Epithelium
by Varun Shankavaram, Dean Shah, Aseel Alashqar, Jackson Sweeney and Hilal Arnouk
Genes 2024, 15(9), 1122; https://doi.org/10.3390/genes15091122 - 26 Aug 2024
Viewed by 1282
Abstract
The prevalence of squamous cell carcinoma is increasing, and efforts that aid in an early and accurate diagnosis are crucial to improve clinical outcomes for patients. Cornulin, a squamous epithelium-specific protein, has recently garnered attention due to its implications in the progression of squamous cell carcinomas arising in several tissues. As an epidermal differentiation marker, it is involved in skin anchoring, regulates cellular proliferation, and is a putative tumor suppressor. The physiologically healthy squamous epithelium displays a considerable level of Cornulin, whereas squamous cell carcinomas show marked downregulation, suggesting that Cornulin expression levels can be utilized for the early detection and follow-up of the progression of these types of cancer. Cornulin’s expression patterns in cervical cancer have been examined, and findings support the stepwise downregulation of Cornulin levels that accompanies the progression to neoplasia in the cervix. Additional studies documented a similar trend in expression in other types of cancer, such as cutaneous, esophageal, and oropharyngeal squamous cell carcinomas. The consistent and predictable pattern of Cornulin expression across several squamous cell carcinomas and its correlation with key clinicopathological parameters make it a reliable biomarker for assessing the transformation and progression events in the squamous epithelium, thus potentially contributing to the early detection, definitive diagnosis, and more favorable prognosis for these cancer patients. Full article
(This article belongs to the Special Issue Molecular Diagnostic and Prognostic Markers of Human Cancers)
Show Figures

Graphical abstract
Figure 1. Schematic graph showing the correlation between the downregulation in Cornulin expression and the progression of oral squamous cell carcinomas from the normal oral mucosa to dysplastic premalignant lesions to invasive phenotypes. Representative immunohistochemistry staining for Cornulin in normal oral mucosa (A), leukoplakia lesion (B), and oral squamous cell carcinoma (C). Similar trends have been documented for cervical and esophageal cancers.
Figure 2. Illustration of evaluating the extent of tumor spread and margins using direct visual examination of the tumor mass (yellow zone), microscopic examination of histological alterations (green zone), and molecular studies to reveal genetic and proteomic alterations in the precursor fields (red zone) that can suffer malignant transformation leading to local relapses in head and neck cancer patients.
Figure 3. Cornulin expression around keratin pearls in well-differentiated cutaneous squamous cell carcinoma tissue samples. Representative images (A) H&E-stained and (B) immunohistochemistry-stained show intense Cornulin immunoreactivity in the central keratinocytes (green asterisk) adjacent to the keratin pearl (dotted circle), while the peripheral keratinocytes (red asterisk) do not show any detectable levels of Cornulin expression.
21 pages, 4424 KiB  
Article
CSA-SA-CRTNN: A Dual-Stream Adaptive Convolutional Cyclic Hybrid Network Combining Attention Mechanisms for EEG Emotion Recognition
by Ren Qian, Xin Xiong, Jianhua Zhou, Hongde Yu and Kaiwen Sha
Brain Sci. 2024, 14(8), 817; https://doi.org/10.3390/brainsci14080817 - 15 Aug 2024
Viewed by 721
Abstract
In recent years, EEG-based emotion recognition technology has made progress, but there are still problems of low model efficiency and loss of emotional information, and there is still room for improvement in recognition accuracy. To fully utilize EEG’s emotional information and improve recognition accuracy while reducing computational costs, this paper proposes a Convolutional-Recurrent Hybrid Network with a dual-stream adaptive approach and an attention mechanism (CSA-SA-CRTNN). Firstly, the model utilizes a CSAM module to assign corresponding weights to EEG channels. Then, an adaptive dual-stream convolutional-recurrent network (SA-CRNN and MHSA-CRNN) is applied to extract local spatial-temporal features. After that, the extracted local features are concatenated and fed into a temporal convolutional network with a multi-head self-attention mechanism (MHSA-TCN) to capture global information. Finally, the extracted EEG information is used for emotion classification. We conducted binary and ternary classification experiments on the DEAP dataset, achieving 99.26% and 99.15% accuracy for arousal and valence in binary classification and 97.69% and 98.05% in ternary classification, and on the SEED dataset, we achieved an accuracy of 98.63%, surpassing relevant algorithms. Additionally, the model’s efficiency is significantly higher than other models, achieving better accuracy with lower resource consumption. Full article
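The channel-weighting step attributed to the CSAM module (assigning a learned weight to each EEG channel before the dual-stream branches) can be illustrated with a squeeze-and-excitation-style block. This is only a generic sketch of that mechanism with assumed dimensions, not the paper's CSAM design.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Assigns a learned weight to each EEG channel (squeeze-and-excitation-style sketch)."""
    def __init__(self, n_channels=32, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(n_channels, n_channels // reduction), nn.ReLU(),
            nn.Linear(n_channels // reduction, n_channels), nn.Sigmoid(),
        )

    def forward(self, x):                 # x: (batch, eeg_channels, time_samples)
        w = self.fc(x.mean(dim=-1))       # squeeze over time, then excite: (batch, eeg_channels)
        return x * w.unsqueeze(-1)        # reweight each channel's time series

x = torch.randn(16, 32, 512)              # e.g. 32 EEG channels, 512 samples per segment (assumed)
weighted = ChannelAttention(32)(x)         # same shape, channels scaled by learned attention
```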
(This article belongs to the Section Neurotechnology and Neuroimaging)
Show Figures

Graphical abstract
Figure 1. Emotion model: (a) discrete model, (b) two-dimensional valence–arousal model.
Figure 2. Frame diagram of the CSA-SA-CRTNN model. The model consists of four modules: the CSAM module, the SA-CRNN module, the MHSA-CRNN module, and the MHSA-TCN module.
Figure 3. CSAM structure diagram.
Figure 4. MHSA structure diagram.
Figure 5. Structure diagram of MHSA-TCN.
Figure 6. Accuracy–epoch relationship diagram. (a) DEAP dataset; (b) SEED dataset.
Figure 7. The average accuracy of arousal and valence on DEAP using CSA-SA-CRTNN for each subject. (a) 2-class; (b) 3-class.
Figure 8. Confusion matrix: (a) 2-class arousal; (b) 2-class valence; (c) 3-class arousal; (d) 3-class valence.
Figure 9. Experimental results on the SEED dataset: (a) average accuracy for each subject; (b) confusion matrix.
Figure 10. Comparison of attention mechanisms in different channels.
22 pages, 8725 KiB  
Article
Adaptive CAPTCHA: A CRNN-Based Text CAPTCHA Solver with Adaptive Fusion Filter Networks
by Xing Wan, Juliana Johari and Fazlina Ahmat Ruslan
Appl. Sci. 2024, 14(12), 5016; https://doi.org/10.3390/app14125016 - 8 Jun 2024
Cited by 1 | Viewed by 1600
Abstract
Text-based CAPTCHAs remain the most widely adopted security scheme, which is the first barrier to securing websites. Deep learning methods, especially Convolutional Neural Networks (CNNs), are the mainstream approach for text CAPTCHA recognition and are widely used in CAPTCHA vulnerability assessment and data collection. However, verification code recognizers are mostly deployed on the CPU platform as part of a web crawler and security assessment; they are required to have both low complexity and high recognition accuracy. Due to the specifically designed anti-attack mechanisms like noise, interference, geometric deformation, twisting, rotation, and character adhesion in text CAPTCHAs, some characters are difficult to efficiently identify with high accuracy in these complex CAPTCHA images. This paper proposed a recognition model named Adaptive CAPTCHA with a CNN combined with an RNN (CRNN) module and trainable Adaptive Fusion Filtering Networks (AFFN), which effectively handle the interference and learn the correlation between characters in CAPTCHAs to enhance recognition accuracy. Experimental results on two datasets of different complexities show that, compared with the baseline model Deep CAPTCHA, the number of parameters of our proposed model is reduced by about 70%, and the recognition accuracy is improved by more than 10 percentage points in the two datasets. In addition, the proposed model has a faster training convergence speed. Compared with several of the latest models, the model proposed by the study also has better comprehensive performance. Full article
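For a CRNN-style text-CAPTCHA recognizer like the one described, the convolutional feature columns are read left to right by a recurrent layer and trained with CTC so that variable-length character strings can be decoded without per-character segmentation. The sketch below shows that recognition path only (the AFFN filtering stage is omitted); image size, alphabet, and layer widths are assumptions.

```python
import torch
import torch.nn as nn

class CaptchaCRNN(nn.Module):
    """Sketch: CNN feature extractor -> BiLSTM over image columns -> per-column character logits."""
    def __init__(self, n_classes=37, hidden=128):  # 36 characters + 1 CTC blank (assumed alphabet)
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),        # halve H and W
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),  # halve H, keep W
        )
        self.rnn = nn.LSTM(128 * 16, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, img):                        # img: (batch, 1, 64, W), grayscale CAPTCHA
        f = self.cnn(img)                          # (batch, 128, 16, W/2)
        f = f.permute(0, 3, 1, 2).flatten(2)       # (batch, W/2, 128*16): one step per column
        out, _ = self.rnn(f)
        return self.fc(out)                        # (batch, W/2, n_classes)

model = CaptchaCRNN()
logits = model(torch.randn(4, 1, 64, 160))         # -> (4, 80, 37)
# CTC training: log-probabilities must be shaped (time, batch, classes).
loss = nn.CTCLoss(blank=0)(logits.log_softmax(2).permute(1, 0, 2),
                           torch.randint(1, 37, (4, 5)),      # 5-character targets (assumed length)
                           torch.full((4,), 80), torch.full((4,), 5))
```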
(This article belongs to the Special Issue Advanced Technologies in Data and Information Security III)
Show Figures

Figure 1. The Network of Deep CAPTCHA.
Figure 2. Some confusing adjacent characters in CAPTCHAs.
Figure 3. The networks of Adaptive CAPTCHA.
Figure 4. Samples of dataset: (a) M-CAPTCHA; (b) P-CAPTCHA.
Figure 5. Character statistical distributions: (a) M-CAPTCHA; (b) P-CAPTCHA.
Figure 6. Markov transition probabilities between characters on the M-CAPTCHA.
Figure 7. The structure of AFFN.
Figure 8. The training process of Alpha.
Figure 9. The structure of CRNN.
Figure 10. ASR with and without filter networks on the M-CAPTCHA.
Figure 11. ASR with and without filter networks on the P-CAPTCHA.
Figure 12. Comparison of images before and after filtering: (a) M-CAPTCHA; (b) P-CAPTCHA.
Figure 13. Loss comparison before and after filtering: (a) M-CAPTCHA; (b) P-CAPTCHA.
Figure 14. AASR with different filter units on the P-dataset.
Figure 15. AASR comparison using FC and CRNN: (a) M-CAPTCHA; (b) P-CAPTCHA.
Figure 16. AASR comparison with and without BN in CRNN.
Figure 17. AASR with different layers of LSTM on the P-dataset.
Figure 18. AASR with different residual connections: (a) M-CAPTCHA; (b) P-CAPTCHA.
Figure 19. AASR with different loss functions.
Figure 20. AASR confusion matrix of Adaptive CAPTCHA on the M-dataset.
Figure 21. AASR of different models.
16 pages, 4245 KiB  
Article
CrnnCrispr: An Interpretable Deep Learning Method for CRISPR/Cas9 sgRNA On-Target Activity Prediction
by Wentao Zhu, Huanzeng Xie, Yaowen Chen and Guishan Zhang
Int. J. Mol. Sci. 2024, 25(8), 4429; https://doi.org/10.3390/ijms25084429 - 17 Apr 2024
Cited by 1 | Viewed by 1699
Abstract
CRISPR/Cas9 is a powerful genome-editing tool in biology, but its wide applications are challenged by a lack of knowledge governing single-guide RNA (sgRNA) activity. Several deep-learning-based methods have been developed for the prediction of on-target activity. However, there is still room for improvement. Here, we proposed a hybrid neural network named CrnnCrispr, which integrates a convolutional neural network and a recurrent neural network for on-target activity prediction. We performed unbiased experiments with four mainstream methods on nine public datasets with varying sample sizes. Additionally, we incorporated a transfer learning strategy to boost the prediction power on small-scale datasets. Our results showed that CrnnCrispr outperformed existing methods in terms of accuracy and generalizability. Finally, we applied a visualization approach to investigate the generalizable nucleotide-position-dependent patterns of sgRNAs for on-target activity, which shows potential in terms of model interpretability and further helps in understanding the principles of sgRNA design. Full article
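The hybrid CNN/RNN idea behind CrnnCrispr can be sketched as a convolutional branch over one-hot-encoded sgRNA combined with a recurrent branch over label-encoded sequence, regressing a single activity score. The layer sizes and 23-nt input length below are assumptions; the actual architecture is described in the Figure 5 caption further down.

```python
import torch
import torch.nn as nn

class SgRnaActivityRegressor(nn.Module):
    """Sketch of a CNN + BiGRU hybrid for sgRNA on-target activity regression."""
    def __init__(self, seq_len=23, emb=32, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(                       # branch 1: one-hot input (4 x seq_len)
            nn.Conv1d(4, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.embed = nn.Embedding(5, emb)               # branch 2: label-encoded bases (0 = pad)
        self.gru = nn.GRU(emb, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Sequential(
            nn.Linear(seq_len * (64 + 2 * hidden), 128), nn.ReLU(),
            nn.Linear(128, 1),                          # predicted on-target activity
        )

    def forward(self, onehot, labels):                  # onehot: (B, 4, L), labels: (B, L)
        a = self.cnn(onehot).permute(0, 2, 1)            # (B, L, 64)
        b, _ = self.gru(self.embed(labels))              # (B, L, 2*hidden)
        return self.head(torch.cat([a, b], dim=-1).flatten(1)).squeeze(-1)

onehot = torch.randn(8, 4, 23)
labels = torch.randint(1, 5, (8, 23))
activity = SgRnaActivityRegressor()(onehot, labels)      # (8,) predicted activity scores
```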
Show Figures

Figure 1. The heatmap shows (a) mean SCC and (b) mean PCC values of CrnnCrispr and four compared methods on nine datasets with three scales, including large-scale, medium-scale and small-scale datasets. The prediction methods are placed vertically, whereas the test datasets are arranged horizontally. Test datasets are classified by sample size.
Figure 2. Performance comparison of CrnnCrispr training from scratch and transfer learning on three small-scale datasets (e.g., HCT116, HELA and HL60) under 5-fold cross-validation.
Figure 3. Performance comparison in terms of SCC of CrnnCrispr and four existing deep-learning-based methods on nine datasets with various scales under a leave-one-cell-out procedure.
Figure 4. Impact of nucleotide composition of sgRNA activity on three large-scale datasets. Bars show the Z-scores of nucleotide frequency for each position. The numbers below represent the positions of the sequence.
Figure 5. Illustration of the CrnnCrispr architecture. The sgRNA was first encoded by one-hot encoding and label encoding and was subsequently used as input of the CNN branch and BiGRU branch, respectively. The outputs of these two branches were concatenated and fed into two LSTM layers for dimensionality reduction. The outputs were flattened and input into three fully connected layers to generate the final representation. The outputs of the final fully connected layer were fed into a linear regression transformation to make a prediction of sgRNA on-target activity.
20 pages, 15351 KiB  
Article
Intelligent Analysis System for Teaching and Learning Cognitive Engagement Based on Computer Vision in an Immersive Virtual Reality Environment
by Ce Li, Li Wang, Quanzhi Li and Dongxuan Wang
Appl. Sci. 2024, 14(8), 3149; https://doi.org/10.3390/app14083149 - 9 Apr 2024
Viewed by 1039
Abstract
The 20th National Congress of the Communist Party of China and the 14th Five Year Plan for Education Informatization focus on digital technology and intelligent learning and implement innovation-driven education environment reform. An immersive virtual reality (IVR) environment has both immersive and interactive characteristics, which are an important way of virtual learning and are also one of the important ways in which to promote the development of smart education. Based on the above background, this article proposes an intelligent analysis system for Teaching and Learning Cognitive engagement in an IVR environment based on computer vision. By automatically analyzing the cognitive investment of students in the IVR environment, it is possible to better understand their learning status, provide personalized guidance to improve learning quality, and thereby promote the development of smart education. This system uses Vue (developed by Evan You, located in Wuxi, China) and ECharts (Developed by Baidu, located in Beijing, China) for visual display, and the algorithm uses the Pytorch framework (Developed by Facebook, located in Silicon Valley, CA, USA), YOLOv5 (Developed by Ultralytics, located in Washington, DC, USA), and the CRNN model (Convolutional Recurrent Neural Network) to monitor and analyze the visual attention and behavioral actions of students. Through this system, a more accurate analysis of learners’ cognitive states and personalized teaching support can be provided for the education field, providing certain technical support for the development of smart education. Full article
Show Figures

Figure 1. Overall system design architecture diagram.
Figure 2. Preprocessing steps diagram.
Figure 3. Preprocessing process diagram. The Chinese word in the picture says “red blood cell”.
Figure 4. Preprocessing result image. The Chinese word in the picture says “red blood cell”.
Figure 5. YOLOv5 architecture diagram.
Figure 6. C3 module structure diagram.
Figure 7. Bottleneck module architecture diagram.
Figure 8. SPP module architecture diagram.
Figure 9. IOU calculation chart.
Figure 10. Changes in NMS processing. The Chinese word in the picture says “red blood cell”.
Figure 11. Text OCR layer flowchart.
Figure 12. CRNN model structure diagram.
Figure 13. Network input image. The Chinese word in the picture says “red blood cell”.
Figure 14. Text recognition process diagram. The Chinese word in the picture says “red blood cell”.
Figure 15. Video frame cutting results.
Figure 16. Video frame cutting results.
Figure 17. YOLOv5 module detection results. The Chinese words in the picture say “red blood cell” and “mitochondria”.
Figure 18. YOLOv5 module detection results. The Chinese word in the picture says “vesica”.
Figure 19. Text OCR detection module detection effect diagram. The Chinese words in the picture say “microtubule” and “white blood cell”.
Figure 20. Integration process diagram of detection data.
Figure 21. Comparison chart between system detection of various target objects and the actual frame rate.
Figure 22. Accuracy chart of system detection for various target objects.
Figure 23. Visualization page diagram.
Figure 24. Visualization of IVR videos and learner videos.
Figure 25. Select statistical object visualization.
Figure 26. Experimental flow chart.
17 pages, 8563 KiB  
Article
Research on the Vision-Based Dairy Cow Ear Tag Recognition Method
by Tianhong Gao, Daoerji Fan, Huijuan Wu, Xiangzhong Chen, Shihao Song, Yuxin Sun and Jia Tian
Sensors 2024, 24(7), 2194; https://doi.org/10.3390/s24072194 - 29 Mar 2024
Cited by 1 | Viewed by 1470
Abstract
With the increase in the scale of breeding at modern pastures, the management of dairy cows has become much more challenging, and individual recognition is the key to the implementation of precision farming. Based on the need for low-cost and accurate herd management and for non-stressful and non-invasive individual recognition, we propose a vision-based automatic recognition method for dairy cow ear tags. Firstly, for the detection of cow ear tags, the lightweight Small-YOLOV5s is proposed, and then a differentiable binarization network (DBNet) combined with a convolutional recurrent neural network (CRNN) is used to achieve the recognition of the numbers on ear tags. The experimental results demonstrated notable improvements: Compared to those of YOLOV5s, Small-YOLOV5s enhanced recall by 1.5%, increased the mean average precision by 0.9%, reduced the number of model parameters by 5,447,802, and enhanced the average prediction speed for a single image by 0.5 ms. The final accuracy of the ear tag number recognition was an impressive 92.1%. Moreover, this study introduces two standardized experimental datasets specifically designed for the ear tag detection and recognition of dairy cows. These datasets will be made freely available to researchers in the global dairy cattle community with the intention of fostering intelligent advancements in the breeding industry. Full article
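The recognition stage (reading the digit string from a cropped ear tag with a CRNN) typically ends with a CTC decoding step that collapses repeated per-column predictions and drops the blank symbol. A minimal greedy decoder is sketched below; the digit alphabet and blank index are assumptions, and the detection and recognition networks themselves are not shown.

```python
import torch

DIGITS = "0123456789"      # assumed ear-tag alphabet; index 0 is reserved for the CTC blank

def greedy_ctc_decode(logits: torch.Tensor, blank: int = 0) -> list:
    """Greedy CTC decoding of CRNN output logits shaped (batch, time_steps, n_classes)."""
    best = logits.argmax(dim=-1)                 # most likely class per column
    results = []
    for seq in best.tolist():
        chars, prev = [], blank
        for idx in seq:
            if idx != blank and idx != prev:     # collapse repeats, skip blanks
                chars.append(DIGITS[idx - 1])    # shift by one because index 0 is the blank
            prev = idx
        results.append("".join(chars))
    return results

# Example: fake per-column logits for two cropped ear-tag images, 40 columns, 11 classes.
fake_logits = torch.randn(2, 40, len(DIGITS) + 1)
print(greedy_ctc_decode(fake_logits))            # e.g. ['73052', '19']
```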
(This article belongs to the Section Smart Agriculture)
Show Figures

Figure 1. Some samples of data from CEID-D. Capture angles: frontal, lateral, and overhead views of cows. Weather conditions during shooting: overcast and sunny days. Captured cow poses: standing, feeding, and lying down.
Figure 2. Ear tag image quality assessment.
Figure 3. Preprocessing of ear tag images. From left to right: the original ear tag, the ear tag after bilateral filtering, the ear tag after edge sharpening, and the ear tag after grayscaling.
Figure 4. Ear tag images annotated with Paddlelabel.
Figure 5. Technology Roadmap.
Figure 6. The structure of YOLOV5s.
Figure 7. The structure of Small-YOLOV5s.
Figure 8. The structure of CA.
Figure 9. The structure of DBNet.
Figure 10. The structure of the CRNN.
Figure 11. Comparison of cow ear tag detection results. (a) The results of ear tag detection using the color threshold method, with the original image on the left and the detection results on the right. (b,c) The detection results of cow ear tags in different scenarios using Small-YOLOV5s.
Figure 12. Loss decay and recognition accuracy in CRNN training.
18 pages, 3564 KiB  
Article
Offline Mongolian Handwriting Recognition Based on Data Augmentation and Improved ECA-Net
by Qing-Dao-Er-Ji Ren, Lele Wang, Zerui Ma and Saheya Barintag
Electronics 2024, 13(5), 835; https://doi.org/10.3390/electronics13050835 - 21 Feb 2024
Viewed by 940
Abstract
Writing is an important carrier of cultural inheritance, and the digitization of handwritten texts is an effective means to protect national culture. Compared to Chinese and English handwriting recognition, research on Mongolian handwriting recognition started relatively late and has achieved few results due to the characteristics of the script itself and the lack of corpus. First, according to the characteristics of Mongolian handwritten characters, the random erasing data augmentation algorithm was modified, and a dual data augmentation (DDA) algorithm was proposed by combining the improved algorithm with horizontal wave transformation (HWT) to augment the dataset for training Mongolian handwriting recognition. Second, the classical CRNN handwriting recognition model was improved. The structure of the encoder and decoder was adjusted according to the characteristics of the Mongolian script, and the attention mechanism was introduced in the feature extraction and decoding stages of the model. An improved handwriting recognition model suited to the features of Mongolian handwriting, named the EGA model, was proposed. Finally, the effectiveness of the EGA model was verified by a large number of data tests. Experimental results demonstrated that the proposed EGA model improves the recognition accuracy of Mongolian handwriting, and the structural modification of the encoder and decoder effectively balances the recognition accuracy and complexity of the model. Full article
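The dual data augmentation idea (a random-erasing variant combined with a horizontal wave transform) can be sketched for a grayscale handwriting image as below. The elliptical erasing region and the sine-based row shift are illustrative assumptions about the modified algorithms, not the exact DDA formulation.

```python
import numpy as np

def elliptical_random_erase(img: np.ndarray, max_frac: float = 0.15) -> np.ndarray:
    """Erase a random elliptical patch (assumed variant of random erasing for thin strokes)."""
    h, w = img.shape
    cy, cx = np.random.randint(h), np.random.randint(w)
    ry = np.random.randint(1, int(h * max_frac) + 1)
    rx = np.random.randint(1, int(w * max_frac) + 1)
    yy, xx = np.ogrid[:h, :w]
    mask = ((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2 <= 1.0
    out = img.copy()
    out[mask] = 255                               # fill with background (white) for handwriting
    return out

def horizontal_wave(img: np.ndarray, amplitude: int = 5) -> np.ndarray:
    """Shift each row horizontally along a sine wave (assumed HWT-style distortion)."""
    h, w = img.shape
    out = np.full_like(img, 255)
    for y in range(h):
        shift = int(amplitude * np.sin(2 * np.pi * y / h))
        out[y] = np.roll(img[y], shift)
    return out

img = np.full((64, 256), 255, dtype=np.uint8)      # placeholder blank handwriting image
augmented = horizontal_wave(elliptical_random_erase(img))
```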
(This article belongs to the Special Issue Deep Learning in Image Processing and Pattern Recognition)
Show Figures

Figure 1. Experimental results of Horizontal Wave Transformation. (a) The original image; (b) the image processed by HWTDA with R = 5, 15, 25, 35, and 45 from left to right.
Figure 2. Experimental test of Improved Random Erasing Data Augmentation. (a) The original image; (b) the original REDA algorithm processes images through rectangular areas; (c) the improved REDA algorithm processes images through elliptical areas.
Figure 3. Flow of the Dual Data Augmentation Algorithm.
Figure 4. EGA Model Training Process.
Figure 5. Network LSTM and Network GRU Structure Comparison. (a) LSTM Network Structure; (b) GRU Network Structure.
Figure 6. Network Structure of the EGA Model.
Figure 7. Offline Mongolian Handwriting Images.
Figure 8. Effect of Dual Data Augmentation. (a) The original image; (b) the image processed only by HWTDA; (c) the image processed by DDA.
Figure 9. CRNN Model Recognition Accuracy Curve before and after Dual Data Augmentation.
Figure 10. Model Loss Value Curve before and after Dual Data Augmentation.
Figure 11. EGA Model Loss Value Curve.
Figure 12. Comparison of Recognition Accuracy between EGA Model and CRNN Model.
24 pages, 1950 KiB  
Article
Optimal Training Dataset Preparation for AI-Supported Multilanguage Real-Time OCRs Using Visual Methods
by Attila Biró, Sándor Miklós Szilágyi and László Szilágyi
Appl. Sci. 2023, 13(24), 13107; https://doi.org/10.3390/app132413107 - 8 Dec 2023
Viewed by 1540
Abstract
In the realm of multilingual, AI-powered, real-time optical character recognition systems, this research explores the creation of an optimal, vocabulary-based training dataset. This comprehensive endeavor seeks to encompass a range of criteria: comprehensive language representation, high-quality and diverse data, balanced datasets, contextual understanding, domain-specific adaptation, robustness and noise tolerance, and scalability and extensibility. The approach aims to leverage techniques like convolutional neural networks, recurrent neural networks, convolutional recurrent neural networks, and single visual models for scene text recognition. While focusing on English, Hungarian, and Japanese as representative languages, the proposed methodology can be extended to any existing or even synthesized languages. The development of accurate, efficient, and versatile OCR systems is at the core of this research, offering societal benefits by bridging global communication gaps, ensuring reliability in diverse environments, and demonstrating the adaptability of AI to evolving needs. This work not only mirrors the state of the art in the field but also paves new paths for future innovation, accentuating the importance of sustained research in advancing AI’s potential to shape societal development. Full article
Show Figures

Figure 1. Adjusted PaddleOCR architecture (adapted from [19]).
Figure 2. Result of data imbalance mitigation.
Figure 3. Dataset preparation for real-time OCR (adapted from [42,47]).
Figure 4. Sobel filter on data generation: (a) in the case of single-line text; (b) in the case of multiline text.
Figure 5. Distribution of text lengths in the train dataset.
Figure 6. Character number distribution.
Figure 7. Text length distribution—experiment 1 (15M_enhujp_v2_1): English–Hungarian–Japanese distribution.
Figure 8. Text length distribution—experiment 1 (30M_enhujp_v2_4): English–Hungarian–Japanese distribution.
Figure 9. Text length distribution—experiment 1 (50M_enhujp_v2_2): English–Hungarian–Japanese distribution.
21 pages, 1406 KiB  
Article
Contactless Heart and Respiration Rates Estimation and Classification of Driver Physiological States Using CW Radar and Temporal Neural Networks
by Amal El Abbaoui, David Sodoyer and Fouzia Elbahhar
Sensors 2023, 23(23), 9457; https://doi.org/10.3390/s23239457 - 28 Nov 2023
Cited by 2 | Viewed by 1540
Abstract
The measurement and analysis of vital signs are a subject of significant research interest, particularly for monitoring the driver’s physiological state, which is of crucial importance for road safety. Various approaches have been proposed using contact techniques to measure vital signs. However, all of these methods are invasive and cumbersome for the driver. This paper proposes using a non-contact sensor based on continuous wave (CW) radar at 24 GHz to measure vital signs. We combine these measurements with distinct temporal neural networks to analyze the signals, detect and extract heart and respiration rates, and classify the physiological state of the driver. This approach offers robust performance in estimating the exact values of heart and respiration rates and in classifying the driver’s physiological state. It is non-invasive and requires no physical contact with the driver, making it particularly practical and safe. The results presented in this paper were obtained using a 1D Convolutional Neural Network (1D-CNN), a Temporal Convolutional Network (TCN), a Recurrent Neural Network, specifically the Bidirectional Long Short-Term Memory (Bi-LSTM), and a Convolutional Recurrent Neural Network (CRNN). Among these, the CRNN emerged as the most effective deep learning approach for vital signal analysis. Full article
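A convolutional recurrent model for the task described (regressing heart and respiration rates from a 1-D radar baseband window, or classifying physiological state) can be sketched as follows. The sampling rate, window length, and layer sizes are assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class VitalSignCRNN(nn.Module):
    """Sketch: 1D convolutions over the radar signal, LSTM over time, regression or classification head."""
    def __init__(self, n_outputs=2, classify=False):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
        )
        self.rnn = nn.LSTM(64, 64, batch_first=True, bidirectional=True)
        self.head = nn.Linear(128, n_outputs)      # 2 outputs: heart rate and respiration rate,
        self.classify = classify                   # or n physiological-state classes

    def forward(self, x):                          # x: (batch, samples) radar window
        z = self.conv(x.unsqueeze(1))               # (batch, 64, samples/16)
        z, _ = self.rnn(z.permute(0, 2, 1))         # (batch, steps, 128)
        out = self.head(z[:, -1])
        return out if not self.classify else out.log_softmax(dim=-1)

rates = VitalSignCRNN()(torch.randn(4, 2000))       # e.g. 10 s windows at 200 Hz (assumed)
```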
(This article belongs to the Section Biomedical Sensors)
Show Figures

Figure 1. The general architecture of the proposed models.
Figure 2. A dilated causal convolution with dilation factors d = 1, 2, 4, 8, 16, and 32 and a filter size k = 3.
Figure 3. Fundamental mechanism of CW radar.
Figure 4. Loss function of regression models.
Figure 5. Loss function of classification models.
Figure 6. Confusion matrix for each model using the simulated dataset, dependent on individual variances.
Figure 7. Comparative accuracy curves: predicting heart and respiration rates based on physiological state.
Figure 8. Confusion matrix for each model using the simulated dataset, independent of individual variances.
Figure 9. Loss function of classification models.
Figure 10. Loss function of regression models.
Figure 11. Confusion matrix for each model using the real dataset, independent of individual variances.
20 pages, 51538 KiB  
Article
Deep-Learning-Based Annotation Extraction Method for Chinese Scanned Maps
by Xun Rao, Jiasheng Wang, Wenjing Ran, Mengzhu Sun and Zhe Zhao
ISPRS Int. J. Geo-Inf. 2023, 12(10), 422; https://doi.org/10.3390/ijgi12100422 - 14 Oct 2023
Viewed by 1761
Abstract
One of a map’s fundamental elements is its annotations, and extracting these annotations is an important step in enabling machine intelligence to understand scanned map data. Due to the complexity of the characters and lines, extracting annotations from scanned Chinese maps is difficult, and there is currently little research in this area. A deep-learning-based framework for extracting annotations from scanned Chinese maps is presented in the paper. An improved EAST annotation detection model and a transfer-learning-based CRNN annotation recognition model make up the two primary parts of this framework. Several sets of comparative tests for annotation detection and recognition were created in order to assess the efficacy of this method for extracting annotations from scanned Chinese maps. The experimental findings show the following: (i) The suggested annotation detection approach in this study revealed precision, recall, and h-mean values of 0.8990, 0.8389, and 0.8635, respectively. These measures demonstrate improvements over the currently popular models of −0.0354 to 0.0907, 0.0131 to 0.2735, and 0.0467 to 0.1919, respectively. (ii) The proposed annotation recognition method in this study revealed precision, recall, and h-mean values of 0.9320, 0.8956, and 0.9134, respectively. These measurements demonstrate improvements over the currently popular models of 0.0294 to 0.1049, 0.0498 to 0.1975, and 0.0402 to 0.1582, respectively. Full article
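The transfer-learning step described (starting from a pretrained text-recognition CRNN and adapting it to map annotations) is commonly implemented by training the pretrained backbone at a much smaller learning rate than the recurrent and output layers. The sketch below assumes a hypothetical model with cnn, rnn, and fc submodules; it is not the authors' code.

```python
import torch.nn as nn
from torch.optim import Adam

class TinyCRNN(nn.Module):
    """Stand-in for a pretrained recognizer exposing cnn / rnn / fc submodules (hypothetical names)."""
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU())
        self.rnn = nn.LSTM(32, 64, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(128, 100)

def build_finetune_optimizer(model: nn.Module, lr_backbone=1e-5, lr_head=1e-3) -> Adam:
    """Fine-tuning: pretrained CNN backbone at a small learning rate, RNN and classifier at a larger one."""
    return Adam([
        {"params": model.cnn.parameters(), "lr": lr_backbone},
        {"params": model.rnn.parameters(), "lr": lr_head},
        {"params": model.fc.parameters(), "lr": lr_head},
    ])

optimizer = build_finetune_optimizer(TinyCRNN())
```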
Show Figures

Figure 1. Comparison of real scanned map and simulated map styles.
Figure 2. Flowchart of the simulation map generator.
Figure 3. Methodology framework diagram.
Figure 4. Improved AdvancedEAST annotation detection model structure.
Figure 5. Resblock structure diagram.
Figure 6. ASF structure diagram.
Figure 7. Transfer learning process diagram.
Figure 8. Annotation detection results for different annotation styles.
Figure 9. Annotation detection results for different annotation styles.
Figure 10. Annotation detection results with different map background interference.
Figure 11. Annotation recognition results with different background interference.
Figure 12. Chinese scanned map annotation detection results. The red box indicates a correct detection, the blue box indicates an incorrect detection, and the green box indicates a missed detection.
Figure 13. Comparison of detection results of different detection models. The red box indicates a correct detection, the blue box indicates an incorrect detection, and the green box indicates a missed detection.
Figure 14. Chinese scanned map annotation recognition results.
15 pages, 4129 KiB  
Article
Detection and Recognition of Tilted Characters on Railroad Wagon Wheelsets Based on Deep Learning
by Fengxia Xu, Zhenyang Xu, Zhongda Lu, Chuanshui Peng and Shiwei Yan
Sensors 2023, 23(18), 7716; https://doi.org/10.3390/s23187716 - 7 Sep 2023
Viewed by 1077
Abstract
The quality of railroad wheelsets is an important guarantee for the safe operation of wagons, and mastering the production information of wheelsets plays a vital role in vehicle scheduling and railroad transportation safety. However, when using object detection methods to detect the production information of wheelsets, conditions such as character tilting and unfixed positions affect detection. Therefore, this paper proposes a deep learning-based method for accurately detecting and recognizing tilted character information on railroad wagon wheelsets. It comprises three parts. Firstly, we construct a tilted character detection network based on Faster RCNN for generating a wheelset’s character candidate regions. Secondly, we design a tilted character correction network to classify and correct the orientation of flipped characters. Finally, a character recognition network is constructed based on a convolutional recurrent neural network (CRNN) to realize the task of recognizing a wheelset’s characters. The result shows that the method can quickly and effectively detect and identify the information of tilted characters on wheelsets in images. Full article
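The correction stage described (classifying whether a detected character crop is flipped or rotated, then restoring its orientation before recognition) can be sketched as a small four-way orientation classifier followed by the corresponding inverse rotation. The backbone and the assumed set of orientations (0°, 90°, 180°, 270°) are illustrative choices, not the paper's design.

```python
import torch
import torch.nn as nn

class OrientationNet(nn.Module):
    """Small classifier predicting one of four orientations for a cropped character region."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(8),
            nn.Flatten(), nn.Linear(32 * 8 * 8, 4),      # classes: 0, 90, 180, 270 degrees
        )

    def forward(self, crop):                              # crop: (batch, 3, H, W)
        return self.features(crop)

def correct_orientation(crop: torch.Tensor, net: OrientationNet) -> torch.Tensor:
    """Rotate the crop back to upright before passing it to the CRNN recognizer."""
    k = int(net(crop.unsqueeze(0)).argmax(dim=-1))        # number of 90-degree turns detected
    return torch.rot90(crop, k=-k, dims=(1, 2))           # undo the detected rotation

upright = correct_orientation(torch.randn(3, 64, 64), OrientationNet())
```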
(This article belongs to the Special Issue Vision Sensors: Image Processing Technologies and Applications)
Show Figures

Figure 1. Overall system network architecture.
Figure 2. Feature extraction network architecture.
Figure 3. DIOU structure schematic.
Figure 4. MobileNetV2 and inverted residual network architecture.
Figure 5. ECA attention module.
Figure 6. Structure of the Bagging algorithm.
Figure 7. Recurrent layer network architecture.
Figure 8. Training results of the network. (a) ACC and loss of detection network training; (b) mAP of detection network training; (c) accuracy and loss of the correction network.
Figure 9. Overall effectiveness of network detection and recognition. The output forms of the characters in the figure are all pictures. There are a few Chinese characters; for example, “左” and “厂” mean “left” and “factory”, respectively.
15 pages, 4376 KiB  
Article
Exploratory Study Analyzing the Urinary Peptidome of T2DM Patients Suggests Changes in ECM but Also Inflammatory and Metabolic Pathways Following GLP-1R Agonist Treatment
by Sonnal Lohia, Justyna Siwy, Emmanouil Mavrogeorgis, Susanne Eder, Stefanie Thöni, Gert Mayer, Harald Mischak, Antonia Vlahou and Vera Jankowski
Int. J. Mol. Sci. 2023, 24(17), 13540; https://doi.org/10.3390/ijms241713540 - 31 Aug 2023
Viewed by 1575
Abstract
Type II diabetes mellitus (T2DM) accounts for approximately 90% of all diabetes mellitus cases in the world. Glucagon-like peptide-1 receptor (GLP-1R) agonists have established an increased capability to target directly or indirectly six core defects associated with T2DM, while the underlying molecular mechanisms of these pharmacological effects are not fully known. This exploratory study was conducted to analyze the effect of treatment with GLP-1R agonists on the urinary peptidome of T2DM patients. Urine samples of thirty-two T2DM patients from the PROVALID study (“A Prospective Cohort Study in Patients with T2DM for Validation of Biomarkers”) collected pre- and post-treatment with GLP-1R agonist drugs were analyzed by CE-MS. In total, 70 urinary peptides were significantly affected by GLP-1R agonist treatment, generated from 26 different proteins. The downregulation of MMP proteases, based on the concordant downregulation of urinary collagen peptides, was highlighted. Treatment also resulted in the downregulation of peptides from SERPINA1, APOC3, CD99, CPSF6, CRNN, SERPINA6, HBA2, MB, VGF, PIGR, and TTR, many of which were previously found to be associated with increased insulin resistance and inflammation. The findings indicate potential molecular mechanisms of GLP-1R agonists in the context of the management of T2DM and the prevention or delaying of the progression of its associated diseases. Full article
(This article belongs to the Special Issue Advances in the Pathogenesis of Diabetic Kidney Disease)
Show Figures

Figure 1. Study design. Urine samples from thirty-two T2DM patients were collected at two time points: pre-treatment and post-treatment with the intervention of GLP-1R agonists at 4.4 ± 4.11 months from the first sample collection. Naturally occurring urinary peptides were quantified in the urine samples by CE-MS analysis, followed by statistical and bioinformatic analysis of the generated urinary peptide profiles.
Figure 2. Results of the urinary peptidomic analysis. (a) Distribution of peptide intensity for all 329 sequenced urinary peptides identified in this study; red dots indicate the statistically significant peptides. (b) Volcano plot depicting the regulation of the 329 peptides in response to GLP-1R agonist treatment (green dots represent the significantly downregulated peptides, red the significantly upregulated peptides, and gray the non-significant peptides). (c) Urinary CE-MS peptide profiles of the 70 significant peptides during pre-treatment and (d) post-treatment.
Figure 3. COL1A1 and COL3A1 peptides. (a) Box and whisker plots depicting the down-regulation of all the significant COL1A1 peptides in response to GLP-1R agonist treatment. (b) Box and whisker plots depicting the differential abundance of the COL3A1 peptides in response to GLP-1R agonist treatment. (c) Alignment of the identified peptide sequences in the primary structure of protein COL1A1. (d) Alignment of the identified peptide sequences in the primary structure of protein COL3A1. In (c,d), the amino acids in green and red depict the down- and up-regulated peptide sequences, respectively.
Figure 4. Protein-protein interactome. The network was constructed based on the 26 parental proteins of the 70 GLP-1R agonist-associated urinary peptides.
Figure 5. Hypothesis. The beneficial effects of GLP-1R agonist treatment on the different pathophysiological pathways associated with T2DM as suggested by the down-regulated non-collagen peptides (respective protein names are shown) in each pathway.
Figure 6. Flowchart summarizing the selection of 32 T2DM patients treated with GLP-1R agonists from the PROVALID study. * PROVALID was an observational study; therefore, patients were treated the way their physicians thought appropriate. For this study, we selected individuals who did not receive a GLP-1R agonist at pre-treatment urine sampling and were administered a GLP-1R agonist at post-treatment urine sampling.
25 pages, 1968 KiB  
Article
A Three-Stage Uyghur Recognition Model Combining the Attention Mechanism and Different Convolutional Recurrent Networks
by Wentao Li, Yuduo Zhang, Yongdong Huang, Yue Shen and Zhe Wang
Appl. Sci. 2023, 13(17), 9539; https://doi.org/10.3390/app13179539 - 23 Aug 2023
Cited by 1 | Viewed by 1515
Abstract
Uyghur text recognition faces several challenges in the field due to the scarcity of publicly available datasets and the intricate nature of the script characterized by strong ligatures and unique attributes. In this study, we propose a unified three-stage model for Uyghur language recognition. The model is developed using a self-constructed Uyghur text dataset, enabling evaluation of previous Uyghur text recognition modules as well as exploration of novel module combinations previously unapplied to Uyghur text recognition, including Convolutional Recurrent Neural Networks (CRNNs), Gated Recurrent Convolutional Neural Networks (GRCNNs), ConvNeXt, and attention mechanisms. Through a comprehensive analysis of the accuracy, time, normalized edit distance, and memory requirements of different module combinations on a consistent training and evaluation dataset, we identify the most suitable text recognition structure for Uyghur text. Subsequently, utilizing the proposed approach, we train the model weights and achieve optimal recognition of Uyghur text using the ConvNeXt+Bidirectional LSTM+attention mechanism structure, achieving a notable accuracy of 90.21%. These findings demonstrate the strong generalization and high precision exhibited by Uyghur text recognition based on the proposed model, thus establishing its potential practical applications in Uyghur text recognition. Full article
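The best-performing combination reported above (ConvNeXt features, a bidirectional LSTM sequence model, and an attention-based predictor) differs from CTC mainly in its decoder: characters are emitted one at a time, with each step attending over the column features. One step of an additive-attention decoder is sketched below with assumed sizes; the feature extractor itself is omitted.

```python
import torch
import torch.nn as nn

class AttnDecoderStep(nn.Module):
    """One step of an additive-attention decoder over BiLSTM column features (sketch)."""
    def __init__(self, feat_dim=256, hidden=256, n_classes=80):
        super().__init__()
        self.score = nn.Linear(feat_dim + hidden, 1)           # additive attention scoring
        self.rnn_cell = nn.GRUCell(feat_dim + n_classes, hidden)
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, feats, state, prev_onehot):
        # feats: (B, T, feat_dim) column features; state: (B, hidden); prev_onehot: (B, n_classes)
        e = self.score(torch.cat([feats, state.unsqueeze(1).expand(-1, feats.size(1), -1)], dim=-1))
        alpha = torch.softmax(e, dim=1)                        # (B, T, 1) attention over columns
        context = (alpha * feats).sum(dim=1)                   # (B, feat_dim) glimpse
        state = self.rnn_cell(torch.cat([context, prev_onehot], dim=-1), state)
        return self.out(state), state                          # logits for the next character

step = AttnDecoderStep()
feats = torch.randn(2, 40, 256)                                # 40 columns of BiLSTM features
logits, state = step(feats, torch.zeros(2, 256), torch.zeros(2, 80))
```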
Show Figures

Figure 1. Three-stage Uyghur recognition structure.
Figure 2. VGG feature extraction network.
Figure 3. GRCNN feature extraction network.
Figure 4. (a) ResNet and (b) ConvNeXt feature extraction network block.
Figure 5. Deep bidirectional LSTM network.
Figure 6. Connectionist temporal classification (CTC).
Figure 7. Attention mechanism (Attn).
Figure 8. Data enhancement of Uyghur images.
Figure 9. Examples of computer-cut Uyghur words. (a) Sample print template data. (b) Sample hand-drawn template data. (c) Sample electronic template data.
Figure 10. Example of manual adjustment data.
Figure 11. Examples of computer-cut Uyghur words.
Figure 12. Sample data after trimming the edges.
Figure 13. Data pre-processing steps. (a) Original picture. (b) Padding. (c) Uniform scale. (d) Grayscale.
Figure 14. Effect of data augmentation via different methods. (a) Original picture. (b) Stochastic affine transformation. (c) Gaussian noise. (d) Elastic transformation.
Figure 15. Saliency map of the whole word for predicting the degree of influence.
Figure 16. Saliency maps for each character for predicting the degree of influence.
Figure 17. Local interpretable model-agnostic explanations.
Figure 18. Shapley additive explanation.
Figure 19. Feature maps for partial networks.
Figure 20. Accuracy—parameter chart.
Figure 21. Accuracy—time chart.
Figure 22. CTC versus Attn (number of parameters).
Figure 23. CTC versus Attn (model testing time).
Figure 24. Comparison of feature extraction networks (number of parameters).
Figure 25. Comparison of feature extraction networks (model testing time).
Figure A1. Different writing styles of handwritten Uyghur words.