Deep disturbance-disentangled learning for facial expression recognition
Proceedings of the 28th ACM International Conference on Multimedia, 2020 (dl.acm.org)
To achieve effective facial expression recognition (FER), it is important to address various disturbing factors, such as pose, illumination, and identity. However, a number of FER databases provide labels only for facial expression, identity, and pose, and lack labels for other disturbing factors. As a result, many methods can cope with only one or two disturbing factors and ignore the heavy entanglement between facial expression and multiple disturbing factors. In this paper, we propose a novel Deep Disturbance-disentangled Learning (DDL) method for FER. DDL simultaneously and explicitly disentangles multiple disturbing factors by combining multi-task learning with adversarial transfer learning. The training of DDL involves two stages. First, a Disturbance Feature Extraction Model (DFEM) is pre-trained with multi-task learning to classify multiple disturbing factors on a large-scale face database that provides labels for various disturbing factors. Second, a Disturbance-Disentangled Model (DDM), which contains a global shared sub-network and two task-specific sub-networks (one for expression and one for disturbance), is learned to encode the disturbance-disentangled information for expression recognition. The expression sub-network adopts a multi-level attention mechanism to extract expression-specific features, while the disturbance sub-network leverages adversarial transfer learning to extract disturbance-specific features based on the pre-trained DFEM. Experimental results on both in-the-lab FER databases (including CK+, MMI, and Oulu-CASIA) and in-the-wild FER databases (including RAF-DB and SFEW) show that the proposed method outperforms several state-of-the-art methods.
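The first training stage (multi-task classification of several disturbing factors) can be illustrated with a minimal sketch of a generic multi-task objective: one softmax cross-entropy term per disturbing factor, combined as a weighted sum. This is an assumption for illustration only; the abstract does not specify the loss used by DFEM, and the task names, class counts, logits, and the `task_weights` parameter below are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample given raw scores and the true class index."""
    return -math.log(softmax(logits)[label])


def multi_task_loss(task_logits, task_labels, task_weights=None):
    """Weighted sum of per-task cross-entropy losses.

    task_logits: dict mapping a task name to that head's raw scores.
    task_labels: dict mapping a task name to the true class index.
    task_weights: optional dict of per-task weights (hypothetical knob,
                  defaults to 1.0 for every task).
    """
    if task_weights is None:
        task_weights = {task: 1.0 for task in task_logits}
    return sum(
        task_weights[task] * cross_entropy(task_logits[task], task_labels[task])
        for task in task_logits
    )


# Hypothetical outputs of three disturbance heads for a single face image.
logits = {
    "pose": [2.0, 0.5, -1.0],
    "illumination": [0.1, 0.1],
    "identity": [0.0, 1.5, 0.3, -0.2],
}
labels = {"pose": 0, "illumination": 1, "identity": 1}

loss = multi_task_loss(logits, labels)
print(f"multi-task loss: {loss:.4f}")
```

In a sketch like this, all heads would share one feature extractor, so minimizing the summed loss pushes the shared features to encode every labeled disturbing factor at once. How the paper balances the tasks is not stated in the abstract.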