
Knowledge Conditioned Variational Learning for One-Class Facial Expression Recognition

Published: 01 January 2023

Abstract

The openness of real-world application scenarios and the difficulty of data collection make it impossible to prepare every kind of expression for training. Detecting expressions that are absent from the training set (called alien expressions) is therefore important for the robustness of a recognition system. In this paper, we propose a facial expression recognition (FER) model, named OneExpressNet, that quantifies the probability that a test expression sample belongs to the distribution of the training data. The proposed model is based on a variational auto-encoder and has several merits. First, unlike the conventional one-class classification protocol, OneExpressNet transfers useful knowledge from a related source domain as a conditioning constraint on the target distribution, so that it attends to the regions most descriptive for FER. Second, features from the source and target tasks are aggregated through a skip connection between the encoder and decoder. Finally, to further separate alien expressions from training expressions, an empirical compact variation loss is jointly optimized so that training expressions concentrate on a compact manifold in feature space. Experimental results show that our method achieves state-of-the-art performance in one-class facial expression recognition on the small-scale lab-controlled datasets CFEE and KDEF and the large-scale in-the-wild datasets RAF-DB and ExpW.
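As a rough illustration of the three ingredients named in the abstract (knowledge conditioning, an encoder-decoder skip connection, and a compactness penalty on the latent codes), the sketch below shows how a one-class variational auto-encoder of this kind could be wired up. It is not the authors' implementation: the layer sizes, the `cond` vector standing in for source-domain knowledge, the compactness term, and the loss weights are all hypothetical placeholders.

```python
# Minimal sketch (not the paper's code) of a conditional VAE for one-class FER.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OneClassVAE(nn.Module):
    def __init__(self, in_dim=512, cond_dim=128, latent_dim=64):
        super().__init__()
        # Encoder conditioned on knowledge transferred from a related source task.
        self.enc = nn.Sequential(nn.Linear(in_dim + cond_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        # Decoder; the encoder hidden state is fed forward as a skip connection.
        self.dec = nn.Sequential(nn.Linear(latent_dim + 256, 256), nn.ReLU(),
                                 nn.Linear(256, in_dim))

    def forward(self, x, cond):
        h = self.enc(torch.cat([x, cond], dim=1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        x_rec = self.dec(torch.cat([z, h], dim=1))                # skip connection
        return x_rec, mu, logvar

def one_class_loss(x, x_rec, mu, logvar, beta=1.0, gamma=0.1):
    rec = F.mse_loss(x_rec, x)                                      # reconstruction
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # KL to the prior
    compact = mu.var(dim=0).mean()  # crude compactness surrogate: pull latent means together
    return rec + beta * kld + gamma * compact
```

At test time, a sample with a high reconstruction error or a latent code far from the training cluster would be flagged as an alien expression, in line with the abstract's goal of scoring how likely a test sample is to come from the training distribution.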



Published In

IEEE Transactions on Image Processing, Volume 32, 2023, 5324 pages

Publisher

IEEE Press
