Search Results (987)

Search Parameters:
Keywords = facial expression

17 pages, 3658 KiB  
Article
Change and Detection of Emotions Expressed on People’s Faces in Photos
by Zbigniew Piotrowski, Maciej Kaczyński and Tomasz Walczyna
Appl. Sci. 2024, 14(22), 10681; https://doi.org/10.3390/app142210681 - 19 Nov 2024
Viewed by 80
Abstract
Human emotions are a focus of attention in areas such as psychology, marketing, medicine, and public safety. Correctly detecting human emotions is a complex matter, and the more complex and visually similar emotions are, the more difficult they become to distinguish. Making visual modifications to the faces of people in photos in a way that changes the perceived emotion while preserving the characteristic features of the original face is one of the areas of research in deepfake technologies. The aim of this article is to showcase the outcomes of computer simulation experiments that utilize artificial intelligence algorithms to change the emotions on people’s faces. To detect and change emotions, the deep neural networks discussed further in this article were used.
(This article belongs to the Special Issue Machine Perception and Learning)
Figures:
Figure 1: Wheel of emotion [3].
Figure 2: Circumplex theory of affect [3].
Figure 3: EmoDNN emotion change preview.
Figure 4: Confusion matrices of the trained classifiers (left: PyTorch-based; right: TensorFlow-based).
Figure 5: Confusion matrices of the trained classifiers on generated faces with changed emotion (left: PyTorch-based; right: TensorFlow-based).
Figure A1: Preview of sample generated images for individual emotions (rows: different emotions; columns: pairs of [original image, image with changed emotion generated by EmoDNN]).
28 pages, 6900 KiB  
Article
A New Approach to Recognize Faces Amidst Challenges: Fusion Between the Opposite Frequencies of the Multi-Resolution Features
by Regina Lionnie, Julpri Andika and Mudrik Alaydrus
Algorithms 2024, 17(11), 529; https://doi.org/10.3390/a17110529 - 17 Nov 2024
Viewed by 473
Abstract
This paper proposes a new approach to pixel-level fusion that combines opposite frequencies from the discrete wavelet transform with the Gaussian or Difference of Gaussian. The low-frequency sub-band from the discrete wavelet transform was fused with the Difference of Gaussian, while the high-frequency sub-bands were fused with the Gaussian. The result was reconstructed using an inverse discrete wavelet transform into one enhanced image. These enhanced images were used to improve recognition performance in the face recognition system. The proposed method was tested on benchmark face datasets such as The Database of Faces (AT&T), the Extended Yale B Face Dataset, the BeautyREC Face Dataset, and the FEI Face Dataset. The results showed that our proposed method was robust and accurate against challenges such as lighting conditions, facial expressions, head pose, 180-degree rotation of the face profile, dark images, acquisition with a time gap, and conditions where the person wears accessories such as glasses. The proposed method is comparable to state-of-the-art methods and generates high recognition performance (more than 99% accuracy).
(This article belongs to the Section Algorithms for Multidisciplinary Applications)
Figures:
Figure 1: Examples of images inside each dataset: (a) AT&T [40], (b) BeautyREC [41], (c) EYB [42,43], (d) EYB-Dark [42,43], (e) FEI [44], (f) FEI-FE [44].
Figure 2: The flowchart of our proposed method.
Figure 3: The MRA-DWT sub-bands (from left to right): approximation, horizontal, vertical, and diagonal sub-bands with Haar and one level of decomposition.
Figure 4: Illustration of the scaling function (left) and wavelet function (right) of the Haar wavelet.
Figure 5: Results from Gaussian filtering and the Difference of Gaussian (from left to right): original image, Gaussian-filtered image with σ1, Gaussian-filtered image with σ2, Difference of Gaussian.
Figure 6: Example results from the proposed fusion (from top to bottom): AL, HG, VG, DG with image fusion DWT/IDWT-IF using the mean-mean rule.
Figure 7: Comparison of processing times for the AT&T Face Dataset; Exp. 5; Exp. 6 using db2 in DWT/IDWT-IF with levels of decomposition: one (Exp. 6a), three (Exp. 6b), five (Exp. 6c), and seven (Exp. 6d).
Figure 8: Accuracy results (%) for the AT&T Face Dataset (proposed method) using different wavelet families in MRA-DWT/IDWT with one level of decomposition: (a) Experiment 5; (b) Experiment 6.
Figure 9: Accuracy results (%) for the AT&T Face Dataset from Experiment 6 (proposed method) using the db2 wavelet in DWT/IDWT-IF and bior3.3 in MRA-DWT/IDWT with variations in the level of decomposition.
Figure 10: Accuracy results (%) for the AT&T Face Dataset from Experiment 6 (proposed method) using various wavelet families in DWT/IDWT-IF with five levels of decomposition and bior3.3 in MRA-DWT/IDWT.
Figure 11: Accuracy results (%) for the EYB Face Dataset for Experiments 2, 4, 5, and 6.
Figure 12: Accuracy results (%) for the EYB-Dark Face Dataset for Experiments 2, 4, 5, and 6.
Figure 13: Accuracy results (%) for the EYB-Dark Face Dataset for Experiment 6 using the mean-mean, min-max, and max-min fusion rules.
Figure 14: Fusion results of DWT/IDWT-IF with db2 and five levels of decomposition (from left to right), top: original image, min-max rule, max-min rule, mean-mean rule; bottom: the same fusion results scaled to the pixel value range.
Figure 15: Accuracy results (%) for the EYB-Dark Face Dataset for Experiment 6 with the mean-mean fusion rule using different wavelet families for MRA-DWT/IDWT.
Figure 16: Accuracy results (%) for the BeautyREC Dataset from Exp. 5 and 6, employing either 1820 images or all (3000) images.
Figure 17: Accuracy results (%) for the BeautyREC Dataset: Exp. 5, LP-IF with MRA-DWT/IDWT (a) haar, (b) db2, (c) sym2, (d) bior2.6, (e) bior3.3; Exp. 6, DWT/IDWT-IF with MRA-DWT/IDWT (a) haar, (b) db2, (c) sym2, (d) bior2.6, (e) bior3.3; Exp. 6, DWT/IDWT-IF with haar for MRA-DWT/IDWT and the db2 wavelet with (f) one, (g) three, or (h) seven levels of decomposition; Exp. 6, DWT/IDWT-IF with haar for MRA-DWT/IDWT and five levels of decomposition using the wavelets (i) haar, (j) sym2, (k) bior2.6; Exp. 6, DWT/IDWT-IF using the (l) min-max or (m) max-min fusion rule. All results came from an SVM with the cubic kernel.
Figure 18: Example of high variation for one person inside the BeautyREC Face Dataset.
Figure 19: Accuracy results (%) for the FEI Face Database from Exp. 5 and 6.
Figure 20: Accuracy results (%) for the FEI-FE Face Database from Exp. 5 and 6.
22 pages, 2740 KiB  
Article
Unsupervised Canine Emotion Recognition Using Momentum Contrast
by Aarya Bhave, Alina Hafner, Anushka Bhave and Peter A. Gloor
Sensors 2024, 24(22), 7324; https://doi.org/10.3390/s24227324 - 16 Nov 2024
Viewed by 283
Abstract
We describe a system for identifying dog emotions based on dogs’ facial expressions and body posture. Towards that goal, we built a dataset with 2184 images of ten popular dog breeds, grouped into seven similarly sized primal mammalian emotion categories defined by neuroscientist and psychobiologist Jaak Panksepp as ‘Exploring’, ‘Sadness’, ‘Playing’, ‘Rage’, ‘Fear’, ‘Affectionate’ and ‘Lust’. We modified the contrastive learning framework MoCo (Momentum Contrast for Unsupervised Visual Representation Learning) to train it on our original dataset and achieved an accuracy of 43.2% against a baseline of 14%. We also trained this model on a second publicly available dataset, which resulted in an accuracy of 48.46% against a baseline of 25%. We compared our unsupervised approach with a supervised model based on a ResNet50 architecture; when tested on our dataset with the seven Panksepp labels, it reached an accuracy of 74.32%.
(This article belongs to the Special Issue Integrated Sensor Systems for Multi-modal Emotion Recognition)
Figures:
Figure 1: Breed and emotion distribution of images in the dataset.
Figure 2: Examples of the seven emotional behaviors for the Golden Retriever breed.
Figure 3: An image of a Labrador puppy displaying the ‘Exploring’ behavior, augmented according to the above-mentioned augmentations (image source: Wikipedia).
Figure 4: Contrastive learning MoCo framework for dog emotion recognition using ResNet encoders.
Figure 5: Test accuracy and training loss results for the ResNet18 encoder with 1200 epochs.
Figure 6: Test accuracy and training loss results for the ResNet18 encoder with 1800 epochs.
Figure 7: Test accuracy and training loss results for the ResNet34 encoder with 1200 epochs.
Figure 8: Test accuracy and training loss results for the ResNet34 encoder with 1800 epochs.
Figure 9: Unsupervised and supervised model comparison.
23 pages, 5517 KiB  
Article
Research on an Eye Control Method Based on the Fusion of Facial Expression and Gaze Intention Recognition
by Xiangyang Sun and Zihan Cai
Appl. Sci. 2024, 14(22), 10520; https://doi.org/10.3390/app142210520 - 15 Nov 2024
Viewed by 296
Abstract
With the deep integration of psychology, artificial intelligence, and other related technologies, eye control technology has achieved certain results at the practical application level. However, the accuracy of current single-modal eye control technology is still limited, mainly because the high randomness of eye movements during human–computer interaction makes eye movement detection inaccurate. This study therefore proposes an intent recognition method that fuses facial expressions and eye movement information, and builds an eye control method on that fusion using a multimodal intent recognition dataset, containing facial expressions and eye movement information, constructed in this study. Based on a self-attention fusion strategy, the fused features are computed and then classified with a multi-layer perceptron, realizing mutual attention between different features and improving the accuracy of intention recognition by selectively enhancing the weight of effective features. To address inaccurate eye movement detection, an improved YOLOv5 model is proposed, whose detection accuracy is raised by adding a small-target layer and a CA attention mechanism. A corresponding eye movement behavior discrimination algorithm is combined with each eye movement action to output eye behavior instructions. Finally, experimental verification of the eye–computer interaction scheme combining the intention recognition model and the eye movement detection model showed that, with this scheme, the accuracy of the eye-controlled manipulator in performing various tasks exceeds 95 percent.
Figures:
Figure 1: The technical route of this paper’s research.
Figure 2: Face image dataset example.
Figure 3: Eye movement intent detection flow chart, describing the conversion of eye movement data into intent classification.
Figure 4: Integration framework based on the attention mechanism.
Figure 5: Comparison of performance in single-mode and multimodal prediction.
Figure 6: Line charts of five indicators of different models.
Figure 7: Loss function curve of the Anchor method before and after improvement.
Figure 8: Structure diagram of the CA attention mechanism [9].
Figure 9: Improved YOLOv5 model structure.
Figure 10: Loss variation diagram of the improved YOLOv5 model.
Figure 11: The average precision (AP) curve of the improved model.
Figure 12: The F1 score curve of the improved model.
Figure 13: Test results before and after improvement.
Figure 14: Human–computer interaction experiment platform.
Figure 15: The overall flow chart of the experiment.
Figure 16: Comparison of calculation efficiency indicators.
Figure 17: Complete human–computer interaction process.
Figure 18: Test results.
Figure 19: Test results for different tasks.
19 pages, 3590 KiB  
Article
Multi-Head Attention Affinity Diversity Sharing Network for Facial Expression Recognition
by Caixia Zheng, Jiayu Liu, Wei Zhao, Yingying Ge and Wenhe Chen
Electronics 2024, 13(22), 4410; https://doi.org/10.3390/electronics13224410 - 11 Nov 2024
Viewed by 432
Abstract
Facial expressions exhibit inherent similarities, variability, and complexity. In real-world scenarios, challenges such as partial occlusions, illumination changes, and individual differences further complicate the task of facial expression recognition (FER). To further improve the accuracy of FER, a Multi-head Attention Affinity and Diversity Sharing Network (MAADS) is proposed in this paper. MAADS comprises a Feature Discrimination Network (FDN), an Attention Distraction Network (ADN), and a Shared Fusion Network (SFN). Specifically, FDN first integrates attention weights into the objective function to capture the most discriminative features by using the proposed sparse affinity loss. Then, ADN employs multiple parallel attention networks to maximize diversity within spatial attention units and channel attention units, which guides the network to focus on distinct, non-overlapping facial regions. Finally, SFN deconstructs facial features into generic parts and unique parts, which allows the network to learn the distinctions between these features without having to relearn complete features from scratch. To validate the effectiveness of the proposed method, extensive experiments were conducted on several widely used in-the-wild datasets including RAF-DB, AffectNet-7, AffectNet-8, FERPlus, and SFEW. MAADS achieves accuracies of 92.93%, 67.14%, 64.55%, 91.58%, and 62.41% on these datasets, respectively. The experimental results indicate that MAADS not only outperforms current state-of-the-art methods in recognition accuracy but also has relatively low computational complexity.
Figures:
Figure 1: Overview of our proposed MAADS method.
Figure 2: The structure of the attention head.
Figure 3: Illustration of the diversity loss.
Figure 4: Confusion matrices for the different datasets.
Figure 5: Attention maps visualized using the GradCAM++ tool [61]: (a) original image; (b) attention map obtained using a single attention head; (c–f) attention maps obtained using multiple attention heads.
Figure 6: Precision–recall curves for each class on RAF-DB, AffectNet-7, and AffectNet-8.
Figure 7: The impact of the number of attention heads on the RAF-DB dataset.
17 pages, 1880 KiB  
Article
Polish Speech and Text Emotion Recognition in a Multimodal Emotion Analysis System
by Kamil Skowroński, Adam Gałuszka and Eryka Probierz
Appl. Sci. 2024, 14(22), 10284; https://doi.org/10.3390/app142210284 - 8 Nov 2024
Viewed by 414
Abstract
Emotion recognition by social robots is a serious challenge; even people sometimes struggle with it. It is important to use information about emotions from all possible sources: facial expression, speech, or reactions occurring in the body. Therefore, a multimodal emotion recognition system was introduced, which includes the indicated sources of information and deep learning algorithms for emotion recognition. An important part of this system is the speech analysis module, which is divided into two tracks: speech and text. An additional constraint is the target language of communication, Polish, for which the number of datasets and methods is very limited. The work shows that emotion recognition using a single source—text or speech—can lead to low accuracy of the recognized emotion. English and Polish datasets and the latest deep learning methods for speech emotion recognition using Mel spectrograms were therefore compared. The most accurate LSTM models were evaluated on the English set and the Polish nEMO set, demonstrating high emotion recognition efficiency on Polish data. The conducted research is a key element in the development of a decision-making algorithm for several emotion recognition modules in a multimodal system.
Figures:
Figure 1: Mel spectrograms created for each category of the nEMO dataset.
Figure 2: Architecture of the multimodal emotion analysis system.
Figure 3: Confusion matrices for speech emotion analysis models based on the English-language RAVDESS dataset.
Figure 4: Confusion matrices for speech emotion analysis models based on the Polish-language nEMO dataset.
26 pages, 4018 KiB  
Article
A MediaPipe Holistic Behavior Classification Model as a Potential Model for Predicting Aggressive Behavior in Individuals with Dementia
by Ioannis Galanakis, Rigas Filippos Soldatos, Nikitas Karanikolas, Athanasios Voulodimos, Ioannis Voyiatzis and Maria Samarakou
Appl. Sci. 2024, 14(22), 10266; https://doi.org/10.3390/app142210266 - 7 Nov 2024
Viewed by 568
Abstract
This paper introduces a classification model that detects and classifies argumentative behaviors between two individuals by utilizing a machine learning application based on the MediaPipe Holistic model. The approach distinguishes two classes based on the behavior of two individuals, argumentative and non-argumentative, corresponding to verbal argumentative behavior. Using a dataset extracted from video frames of hand gestures, body stance, and facial expression, and their corresponding landmarks, three different classification models were trained and evaluated. The results indicate that the Random Forest Classifier outperformed the other two, classifying argumentative behaviors with 68.07% accuracy and non-argumentative behaviors with 94.18% accuracy, respectively. Thus, there is future scope for advancing this classification model into a prediction model, with the aim of predicting aggressive behavior in patients suffering from dementia before its onset.
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
Figures:
Figure 1: Argumentative image dataset sample.
Figure 2: Non-argumentative image dataset sample.
Figure 3: Cross-validation metrics for the three models.
Figure 4: AUC scores of the three trained models. A model that makes random guesses (no discriminative power) is represented by the diagonal dashed blue line from (0, 0) to (1, 1); the ROC curve of any model that outperforms it lies above this diagonal.
Figure 5: Confusion matrix of the Random Forest Classifier after training.
Figure 6: Confusion matrix of Gradient Boosting after training.
Figure 7: Confusion matrix of the Ridge Classifier after training.
Figure 8: Learning curve of the Random Forest Classifier after training.
Figure 9: Learning curve of Gradient Boosting after training.
Figure 10: Learning curve of the Ridge Classifier after training.
Figure 11: Paired t-test statistic results across all models and metrics.
Figure 12: Confusion matrix of the Random Forest Classifier after testing.
Figure 13: ROC AUC score of the Random Forest Classifier after testing.
Figure 14: Final model evaluation metrics.
Figure 15: Probability range/count of correct argumentative and non-argumentative predictions per 0.1 accuracy range, with 1.0 being the perfect accuracy score.
23 pages, 4732 KiB  
Article
Enhancing Real-Time Emotion Recognition in Classroom Environments Using Convolutional Neural Networks: A Step Towards Optical Neural Networks for Advanced Data Processing
by Nuphar Avital, Idan Egel, Ido Weinstock and Dror Malka
Inventions 2024, 9(6), 113; https://doi.org/10.3390/inventions9060113 - 4 Nov 2024
Viewed by 543
Abstract
In contemporary academic settings, end-of-semester student feedback on a lecturer’s teaching abilities often fails to provide a comprehensive, real-time evaluation of their proficiency, and becomes less relevant with each new cohort of students. To address these limitations, an innovative feedback method has been proposed, utilizing image processing algorithms to dynamically assess the emotional states of students during lectures by analyzing their facial expressions. This real-time approach enables lecturers to promptly adapt and enhance their teaching techniques. Recognizing and engaging with emotionally positive students has been shown to foster better learning outcomes, as their enthusiasm actively stimulates cognitive engagement and information analysis. The purpose of this work is to identify emotions based on facial expressions using a deep learning model based on a convolutional neural network (CNN), with facial recognition performed using the Viola–Jones algorithm on a group of students in a learning environment. The algorithm encompasses four key steps: image acquisition, preprocessing, emotion detection, and emotion recognition. The technological advancement of this research lies in the proposal to implement photonic hardware and create an optical neural network, which offers unparalleled speed and efficiency in data processing and demonstrates significant advancements over traditional electronic systems in handling computational tasks. An experimental validation was conducted in a classroom with 45 students, showing that the predicted level of understanding in the class was 43–62.94% and that the proposed CNN algorithm (facial expression detection) achieved an impressive 83% accuracy in recognizing students’ emotional states. The correlation between the CNN deep learning model and the students’ feedback was 91.7%. This novel approach opens avenues for the real-time assessment of students’ engagement levels and the effectiveness of the learning environment, providing valuable insights for ongoing improvements in teaching practices.
Figures:
Figure 1: Illustration of the emotion recognition model in a virtual environment.
Figure 2: Example of input and integral image metrics.
Figure 3: Schematic sketch of a four-layer neural network: (a) neural network; (b) optical neural network.
Figure 4: Block diagram of the emotion recognition system and the level of understanding on a facial expression screen.
Figure 5: CNN model block diagram: (a) feature map size; (b) block diagram of the CNN model.
Figure 6: CNN model accuracy as a function of iterations.
Figure 7: Model accuracy as a function of the number of convolutional layers.
Figure 8: Illustration of comprehension level over time: (a) student A; (b) student B.
Figure 9: Frequency of specific emotions observed throughout the lesson, categorized according to the levels of understanding: (a) student A; (b) student B.
Figure 10: Image samples with the corresponding face detection boxes: (a) student A; (b) student B.
Figure 11: Comparison of comprehension levels between the automatic CNN model and manual feedback.
13 pages, 1972 KiB  
Article
FaceReader Insights into the Emotional Response of Douro Wines
by Catarina Marques and Alice Vilela
Appl. Sci. 2024, 14(21), 10053; https://doi.org/10.3390/app142110053 - 4 Nov 2024
Viewed by 752
Abstract
Understanding consumers’ emotional responses to wine is essential for improving marketing strategies and product development. Emotions play a pivotal role in shaping consumer preferences. This study investigates the emotional reactions elicited by different types of Douro wines (white, red, and Port) through facial expression analysis using FaceReader software, version 9.0 (Noldus Information Technology, Wageningen, The Netherlands). A total of 80 participants tasted six wine samples, and their facial expressions were recorded and analyzed. FaceReader quantified the intensity of emotions such as happiness, sadness, anger, surprise, fear, and disgust. Arousal levels were also assessed. The results were analyzed through principal component analysis (PCA) to identify patterns and groupings based on emotional responses. White wines evoked more sadness due to their acidity, while red wines were associated with lower levels of sadness and greater comfort. Port wines elicited surprise, probably due to their sweet and fortified nature. Additionally, female participants showed consistently higher arousal levels than males across all wine types. The study highlights distinct emotional profiles for each type of wine and suggests that demographic factors, such as gender, influence emotional responses. These insights can inform targeted marketing and enhance the consumer experience through better alignment of wine characteristics with emotional engagement.
Figures:
Figure 1: FaceReader screenshot recording the emotion “surprised”. Image retrieved from https://www.noldus.com/facereader/set-up, accessed on 11 September 2024.
Figure 2: Emotional profile elicited by each wine: (a) WW1; (b) WW2; (c) RW1; (d) RW2; (e) PW1; (f) PW2.
Figure 3: Arousal elicited by each wine per gender: (a) WW1; (b) WW2; (c) RW1; (d) RW2; (e) PW1; (f) PW2.
Figure 4: An explanatory graphic obtained after PCA, showing that Douro’s white, red, and Port wines (green squares) align according to the emotions they elicit (green triangles).
13 pages, 2697 KiB  
Article
Unilateral “Inactive” Condylar Hyperplasia: New Histological Data
by Michele Runci Anastasi, Antonio Centofanti, Angelo Favaloro, Josè Freni, Fabiana Nicita, Giovanna Vermiglio, Giuseppe Pio Anastasi and Piero Cascone
J. Funct. Morphol. Kinesiol. 2024, 9(4), 217; https://doi.org/10.3390/jfmk9040217 - 2 Nov 2024
Viewed by 337
Abstract
Background: Unilateral condylar hyperplasia (UCH) is characterized by slow progression and enlargement of the condyle, accompanied by elongation of the mandibular body, resulting in facial asymmetry, occlusal disharmony, and joint dysfunction. This condition can be defined as “active” or “inactive”: the active form is characterized by continuous growth and dynamic histologic changes, whereas the inactive form indicates that the growth process has stabilized. Since there are few microscopic studies on the inactive form, this study aims to investigate the histological features and expression of key proteins and bone markers in patients diagnosed with inactive UCH. Methods: A total of 15 biopsies from patients aged 28 to 36 years were examined by light microscopy and immunofluorescence for collagen I and II, metalloproteinases 2 (MMP-2) and 9 (MMP-9), receptor activator of nuclear factor-kappa B (RANK), and osteocalcin. Results: Our findings indicate that during inactive UCH, the ongoing process is not entirely stopped, with moderate expression of collagen, metalloproteinases, RANK, and osteocalcin, although no cartilage islands are detectable. Conclusions: The present study shows that even if these features are moderate when compared to active UCH and lack cartilage islands, inactive UCH could be characterized by borderline features that could represent an important trigger point for possible reactivation, or a long, slow progression that is not “self-limited”.
(This article belongs to the Section Functional Anatomy and Musculoskeletal System)
Figures:
Figure 1: Surgical procedure pictures: (A) pretragic preauricular incision; (B) exposure of the joint capsule; (C) lateral ligament incision; (D) sliced condylectomy; (E) final sutures.
Figure 2: Compound panel of Hematoxylin–Eosin-stained sections of UCH (A–F), showing a thick hyperplastic layer (A,B, white arrows) and an irregular cartilage–bone interface with areas in which the cartilage deepens into the bone tissue without ever detaching in the form of an island (C–F, arrows). (A,C,E) 10× magnification; (B,D,F) 20× magnification.
Figure 3: Panel of Masson staining showing expansion of the hypertrophying layer into bone tissue (A, white arrows) or infiltration of chondrocytes into bone tissue (B, yellow asterisk); a double osteochondral border is also visible. Magnifications: 10× (A); 20× (B).
Figure 4: Immunofluorescence single-localization reactions for MMP-9 (A,B) and MMP-2 (C,D, green channel). The MMP-9 staining pattern is more intense in layers 1, 2, and 3 and decreases in layer 4; the same is observed for MMP-2. (E) Graph showing a more intense fluorescence pattern for MMP-2 compared to MMP-9. Magnifications: 10× (A,C); 20× (B,D).
Figure 5: Immunofluorescence single-localization reactions for osteocalcin (A,B, green channel) and RANK (C,D, red channel). The osteocalcin staining pattern is more intense in layer 4; the RANK staining pattern is well detectable above the cartilage–bone interface and shows osteoclasts within numerous Howship’s lacunae (C,D, white arrows). Magnifications: 10× (A–D).
Figure 6: Immunofluorescence single-localization reactions for collagen type I (A,B, green channel) and collagen type II (C,D, green channel). Both are expressed along the layers, with a more intense fluorescence pattern for collagen type II, as shown by fluorescence intensity analysis (E). Magnifications: 20× (A–D); transmitted light (B,D).
16 pages, 12159 KiB  
Article
LGNMNet-RF: Micro-Expression Detection Using Motion History Images
by Matthew Kit Khinn Teng, Haibo Zhang and Takeshi Saitoh
Algorithms 2024, 17(11), 491; https://doi.org/10.3390/a17110491 - 1 Nov 2024
Viewed by 402
Abstract
Micro-expressions are very brief, involuntary facial expressions that reveal hidden emotions, lasting less than a second, while macro-expressions are more prolonged facial expressions that align with a person’s conscious emotions, typically lasting several seconds. Micro-expressions are difficult to detect in lengthy videos because they have tiny amplitudes, short durations, and frequently coexist alongside macro-expressions. Nevertheless, micro- and macro-expression analysis has sparked interest among researchers. Existing methods use optical flow features to capture temporal differences, but these features are limited to two successive images only. To address this limitation, this paper proposes LGNMNet-RF, which integrates a Lite General Network with MagFace CNN and a Random Forest classifier to predict micro-expression intervals. Our approach leverages Motion History Images (MHI) to capture temporal patterns across multiple frames, offering a more comprehensive representation of facial dynamics than optical flow-based methods, which are restricted to two successive frames. The novelty of our approach lies in the combination of MHI with MagFace CNN, which improves the discriminative power of facial micro-expression detection, and the use of a Random Forest classifier to enhance interval prediction accuracy. The evaluation results show that this method outperforms baseline techniques, achieving micro-expression F1-scores of 0.3019 on CAS(ME)2 and 0.3604 on SAMM-LV. The results of our experiment indicate that MHI offers a viable alternative to optical flow-based methods for micro-expression detection.
(This article belongs to the Special Issue Supervised and Unsupervised Classification Algorithms (2nd Edition))
Figures:
Figure 1: Illustration of the overall LGNMNet-RF architecture.
Figure 2: Illustration of the detailed LGNMNet-RF architecture.
Figure 3: Examples of facial edge features detected using the XDoG filter.
Figure 4: Examples of MHI generated from XDoG edge-detected images.
Figure 5: Illustration of R′ pseudo-labelling with a temporal extension of k = 6.
19 pages, 7937 KiB  
Article
Exploring the Benefits of Herbal Medicine Composite 5 (HRMC5) for Skin Health Enhancement
by Rira Ha, Won Kyong Cho, Euihyun Kim, Sung Joo Jang, Ju-Duck Kim, Chang-Geun Yi and Sang Hyun Moh
Curr. Issues Mol. Biol. 2024, 46(11), 12133-12151; https://doi.org/10.3390/cimb46110720 - 29 Oct 2024
Viewed by 459
Abstract
The skin, as the body’s largest organ, is vital for protecting against environmental stressors, regulating temperature, and preventing water loss. Here, we examined the potential of a mixture of five traditional Korean herbal extracts—Cimicifuga racemosa, Paeonia lactiflora, Phellodendron amurense, Rheum rhaponticum, and Scutellaria baicalensis—referred to as herbal medicine composite 5 (HRMC5), for enhancing skin health and managing menopausal symptoms. High-performance liquid chromatography identified 14 bioactive compounds, including flavonoids, phenolic acids, anthraquinones, and alkaloids. In vitro studies revealed an optimal concentration of 0.625 g/L for cell survival and UV protection, with the mixture demonstrating significant wound-healing properties comparable to epidermal growth factor. HRMC5 exhibited anti-inflammatory effects by downregulating COX2 expression and upregulating key skin barrier proteins. A 4-week clinical trial involving 20 postmenopausal women showed significant improvements in skin redness, hemoglobin concentration, and skin moisture content. Visual analog scale assessments indicated substantial reductions in facial flushing severity and the associated sweating. The topical application of HRMC5 cream offered potential advantages over ingested phytoestrogens by reducing systemic side effects. These findings suggest that HRMC5 is a promising non-invasive treatment for vasomotor symptoms in menopausal women and for overall skin health, warranting further research on its long-term efficacy and safety in larger populations.
Figures:
Figure 1: HPLC analysis of HRMC5, a mixture of five traditional Korean herbal medicine extracts: (A) chromatograms of the extract at different wavelengths (210, 254, 280, 330, 360, and 420 nm); (B) overlay chromatogram showing peaks corresponding to the 14 identified compounds.
Figure 2: Effects of HRMC5 on cell survival and UV protection: (A) survival rate of cells treated with different concentrations of the extract; (B) survival rate after UV exposure. Sterile water served as the control and 1% absolute ethanol as the positive control (PC). *, **, and *** denote significance at the 0.05, 0.01, and 0.001 levels, respectively.
Figure 3: Effect of HRMC5 on wound healing: (A) microscopic images of the wound-healing process at 0 h and 18 h for the untreated control, the EGF-treated positive control (100 ng/mL), and the HRMC5-treated group (0.625 g/L); (B) percentage of healed area after 18 h of culture. Statistical significance by ANOVA with Tukey’s post-hoc test; *** p < 0.001 compared to control.
Figure 4: Effect of HRMC5 on COX2, Filaggrin, and Claudin 1 gene expression: (A) relative COX2 expression in untreated, UV-only, dexamethasone (DEX)-treated, and HRMC5-treated groups; (B) relative Filaggrin expression; (C) relative Claudin 1 expression. Statistical significance by ANOVA with Tukey’s post-hoc test; *** p < 0.001 compared to control.
Figure 5: Effect of HRMC5 on skin barrier proteins: (A–D) representative immunofluorescence staining and quantification of Involucrin, Filaggrin, Claudin 1 (green), and Collagen Type 1 (red) in keratinocytes treated with different concentrations of C5 or the positive control (1% glyceryl glucoside); DAPI (blue) stains nuclei. Different letters indicate significant differences between groups (p < 0.05).
Figure 6: Effects of HRMC5-containing cream on skin parameters over 4 weeks of use (baseline, 2 weeks, 4 weeks): (A,B) skin redness; (C,D) hemoglobin concentration; (E,F) hemoglobin distribution evenness; (G,H) skin moisture content; (I) facial flushing severity; (J) sweating intensity. Data are mean ± standard deviation (n = 20); * p < 0.05, ** p < 0.01, *** p < 0.001 compared to baseline.
24 pages, 1556 KiB  
Review
Audio-Driven Facial Animation with Deep Learning: A Survey
by Diqiong Jiang, Jian Chang, Lihua You, Shaojun Bian, Robert Kosk and Greg Maguire
Information 2024, 15(11), 675; https://doi.org/10.3390/info15110675 - 28 Oct 2024
Viewed by 856
Abstract
Audio-driven facial animation is a rapidly evolving field that aims to generate realistic facial expressions and lip movements synchronized with a given audio input. This survey provides a comprehensive review of deep learning techniques applied to audio-driven facial animation, with a focus on both audio-driven facial image animation and audio-driven facial mesh animation. These approaches employ deep learning to map audio inputs directly onto 3D facial meshes or 2D images, enabling the creation of highly realistic and synchronized animations. This survey also explores evaluation metrics, available datasets, and the challenges that remain, such as disentangling lip synchronization and emotions, generalization across speakers, and dataset limitations. Lastly, we discuss future directions, including multi-modal integration, personalized models, and facial attribute modification in animations, all of which are critical for the continued development and application of this technology.
(This article belongs to the Special Issue Deep Learning for Image, Video and Signal Processing)
Figures:
Figure 1: The development of audio-driven animation.
Figure 2: Scope of this survey.
Figure 3: Graphical illustration of deep learning-based generation of audio-driven 2D video and 3D facial animation.
Figure 4: Graphical illustration of landmark-based methods.
20 pages, 5352 KiB  
Article
Facial Expression Recognition-You Only Look Once-Neighborhood Coordinate Attention Mamba: Facial Expression Detection and Classification Based on Neighbor and Coordinates Attention Mechanism
by Cheng Peng, Mingqi Sun, Kun Zou, Bowen Zhang, Genan Dai and Ah Chung Tsoi
Sensors 2024, 24(21), 6912; https://doi.org/10.3390/s24216912 - 28 Oct 2024
Viewed by 485
Abstract
In studying the joint object detection and classification problem for facial expression recognition (FER) within the YOLOX framework, we introduce a novel feature extractor, called neighborhood coordinate attention Mamba (NCAMamba), to substitute for the original feature extractor in the Feature Pyramid Network (FPN). NCAMamba combines the background information reduction capabilities of Mamba, the local neighborhood relationship understanding of neighborhood attention, and the directional relationship understanding of coordinate attention. The resulting FER-YOLO-NCAMamba model, when applied to two unaligned FER benchmark datasets, RAF-DB and SFEW, obtains significantly improved mean average precision (mAP) scores compared with those obtained by other state-of-the-art methods. Moreover, ablation studies found that the NCA module is relatively more important than the Visual State Space (VSS) module, a version of Mamba for image processing. Visualization studies using the grad-CAM method reveal that regions around the nose tip are critical to recognizing the expression: if the attended region is too large, it may lead to erroneous predictions, while a small, focused region leads to correct recognition. This may explain why FER on unaligned faces is such a challenging problem.
(This article belongs to the Section Sensing and Imaging)
Figures:
Figure 1: Illustration of the VSS (Visual State Space) block.
Figure 2: Illustration of the Neighborhood Coordinate Attention module.
Figure 3: Illustration of the NCAMamba architecture.
Figure 4: Illustration of the overall architecture.
Figure 5: Detection results and corresponding heatmaps on the RAF-DB dataset, where the detection is correct.
Figure 6: Detection results and corresponding heatmaps on the SFEW dataset, where the detection is correct.
Figure 7: Detection results and corresponding heatmaps on the RAF-DB dataset, where the detection is incorrect.
Figure 8: Detection results and corresponding heatmaps on the SFEW dataset, where the detection is incorrect.
18 pages, 3230 KiB  
Article
Autism Identification Based on the Intelligent Analysis of Facial Behaviors: An Approach Combining Coarse- and Fine-Grained Analysis
by Jingying Chen, Chang Chen, Ruyi Xu and Leyuan Liu
Children 2024, 11(11), 1306; https://doi.org/10.3390/children11111306 - 28 Oct 2024
Viewed by 454
Abstract
Background: Facial behavior has emerged as a crucial biomarker for autism identification. However, heterogeneity among individuals with autism poses a significant obstacle to traditional feature extraction methods, which often lack the necessary discriminative power. While deep-learning methods hold promise, they are often criticized for their lack of interpretability. Methods: To address these challenges, we developed an innovative facial behavior characterization model that integrates coarse- and fine-grained analyses for intelligent autism identification. The coarse-grained analysis provides a holistic view by computing statistical measures related to facial behavior characteristics. In contrast, the fine-grained component uncovers subtle temporal fluctuations by employing a long short-term memory (LSTM) model to capture the temporal dynamics of head pose, facial expression intensity, and expression types. To fully harness the strengths of both analyses, we implemented a feature-level attention mechanism. This not only enhances the model’s interpretability but also provides valuable insights by highlighting the most influential features through attention weights. Results: Upon evaluation using three-fold cross-validation on a self-constructed autism dataset, our integrated approach achieved an average recognition accuracy of 88.74%, surpassing the standalone coarse-grained analysis by 8.49%. Conclusions: This experimental result underscores the improved generalizability of facial behavior features and effectively mitigates the complexities stemming from the pronounced intragroup variability of those with autism, thereby contributing to more accurate and interpretable autism identification.
Figures:
Figure 1: A combined coarse- and fine-grained facial behavior characterization model.
Figure 2: Fine-grained analysis.
Figure 3: Feature fusion analysis with SENet.
Figure 4: Photograph of the experimental data collection setup during the research.
Figure 5: Head pose (yaw) of some of the autistic children.
Figure 6: Changes in expression intensity of some typically developing children.