Abstract
The quality and size of the training set strongly affect the results of deep learning-based face-related tasks. However, collecting and labeling enough samples with high quality and balanced distributions remains laborious and expensive, so various data augmentation techniques have been widely used to enrich training datasets. In this paper, we review existing work on face data augmentation from the perspectives of transformation types and methods, covering the state-of-the-art approaches. Among these approaches, we emphasize deep learning-based works, especially generative adversarial networks, which have been recognized as more powerful and effective tools in recent years. We present their principles, discuss their results, and show their applications as well as their limitations. Evaluation metrics for these approaches are also introduced. Finally, we point out the challenges and opportunities in the field of face data augmentation and provide brief yet insightful discussions.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendix
Table 3 summarizes recent works on face data augmentation by transformation type, method, and the evaluations the authors performed to test the data augmentation capability of their algorithms. Note that we label only the transformation types explicitly mentioned in the original papers. Some methods may also be applicable to other transformations that the authors did not mention.
Cite this article
Wang, X., Wang, K. & Lian, S. A survey on face data augmentation for the training of deep neural networks. Neural Comput & Applic 32, 15503–15531 (2020). https://doi.org/10.1007/s00521-020-04748-3
DOI: https://doi.org/10.1007/s00521-020-04748-3