DOI: 10.1109/ICIP.2017.8296442

Facial analysis in the wild with LSTM networks

Published: 17 September 2017

Abstract

The promise of computer vision systems that efficiently and accurately recognize faces and facial variations in naturally occurring circumstances remains elusive. In this paper we present two separate systems for face analysis, both of which use Long Short-Term Memory (LSTM) networks: unconstrained video-based face verification (FaceVideoModel) and spontaneous facial expression recognition (ExpModel). Because LSTM models are well suited to capturing sequential patterns, our results show that such models offer significant advantages over other state-of-the-art approaches for facial analysis in the wild. On the recently introduced YouTube Faces database, our FaceVideoModel achieves a face verification accuracy of 98.70%, an Area Under the Curve (AUC) of 99.94%, and an Equal Error Rate (EER) of 1.2%, the best performance reported on this database among recently proposed methods. Experimental results obtained with the proposed ExpModel on the challenging FER2013 dataset, as well as on the CK+ database, also demonstrate the effectiveness of our deep model for facial expression recognition.
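
The paper itself is not reproduced on this page, but the abstract's central idea, an LSTM that aggregates per-frame face features into a sequence-level decision, can be illustrated with a minimal sketch. The snippet below is a hypothetical PyTorch illustration and not the authors' implementation: the layer sizes, the use of the final hidden state as the video descriptor, and the similarity threshold are all assumptions made only for the example.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class VideoFaceLSTM(nn.Module):
        # Aggregates a sequence of per-frame face embeddings into one
        # L2-normalized video-level descriptor (illustrative architecture).
        def __init__(self, feat_dim=512, hidden_dim=256, embed_dim=128):
            super().__init__()
            self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
            self.proj = nn.Linear(hidden_dim, embed_dim)

        def forward(self, frame_feats):            # (batch, n_frames, feat_dim)
            out, _ = self.lstm(frame_feats)        # hidden state at every frame
            video_desc = self.proj(out[:, -1, :])  # keep the final hidden state
            return F.normalize(video_desc, dim=1)

    # Verification on a pair of face tracks: cosine similarity between the two
    # video descriptors, thresholded at a value tuned on held-out pairs.
    model = VideoFaceLSTM()
    track_a = torch.randn(1, 20, 512)              # 20 frames of 512-d features
    track_b = torch.randn(1, 20, 512)
    with torch.no_grad():
        sim = F.cosine_similarity(model(track_a), model(track_b)).item()
    same_identity = sim > 0.5                      # illustrative threshold only

On a labelled set of same/different video pairs, as in the YouTube Faces protocol, verification accuracy, AUC, and EER would then be computed from such similarity scores, with the EER being the threshold-independent operating point at which the false accept and false reject rates coincide.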


Cited By

  • (2020) Automatic Analysis of Facilitated Taste-liking. Companion Publication of the 2020 International Conference on Multimodal Interaction, pp. 292–300. DOI: 10.1145/3395035.3425645. Online publication date: 25 October 2020.


Published In

2017 IEEE International Conference on Image Processing (ICIP), September 2017, 4869 pages

Publisher

IEEE Press
