Abstract
Detecting and aligning drivers' faces in unconstrained environments is a challenging problem for Intelligent Transportation Systems (ITS), and solving it is conducive to supervising traffic order and maintaining public safety. This paper proposes an improved Multi-task Cascaded Convolutional Network (ITS-MTCNN) for accurate face region detection and facial feature alignment of drivers on highways, predicting face and feature locations in a coarse-to-fine manner. ITS-MTCNN also introduces an improved regularization method and an effective online hard sample mining technique. The model is trained, and the comparative experiments are conducted, on our self-built database of traffic drivers' faces. The effectiveness of ITS-MTCNN is validated by comparative experiments and verified under various complex highway conditions. Average alignment errors are also reported for the left eye, right eye, nose, left mouth corner, and right mouth corner. Experimental results show that ITS-MTCNN performs satisfactorily compared with other state-of-the-art techniques for driver face detection and alignment, and remains robust to occlusion, varying pose, and extreme illumination on highways.
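As an illustration of the per-landmark alignment error mentioned above, the following minimal sketch averages the point-to-point error of each of the five landmarks over a set of faces. The abstract does not state how the error is normalized; dividing by the ground-truth inter-ocular distance is an assumption made here, and the function name, landmark keys, and data layout are hypothetical rather than taken from the paper.

```python
import math

# Hypothetical landmark keys for the five points named in the abstract.
LANDMARKS = ["left_eye", "right_eye", "nose", "left_mouth", "right_mouth"]


def per_landmark_mean_error(predictions, ground_truths):
    """Return the mean normalized error of each landmark over all faces.

    Each face is a dict mapping a landmark name to an (x, y) tuple.
    Assumption: errors are normalized by the ground-truth inter-ocular
    distance; the paper may use a different normalization.
    """
    if len(predictions) != len(ground_truths) or not ground_truths:
        raise ValueError("need equally sized, non-empty prediction and truth lists")

    totals = {name: 0.0 for name in LANDMARKS}
    for pred, truth in zip(predictions, ground_truths):
        # Distance between the two annotated eye centres of this face.
        inter_ocular = math.dist(truth["left_eye"], truth["right_eye"])
        for name in LANDMARKS:
            totals[name] += math.dist(pred[name], truth[name]) / inter_ocular

    count = len(ground_truths)
    return {name: totals[name] / count for name in LANDMARKS}
```

Reporting the error per landmark, rather than as a single mean, shows which facial points degrade most under occlusion or extreme pose.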
Acknowledgements
We would like to thank the National Natural Science Foundation Projects of China (No. 61871123), the National Natural Science Foundation of China (No. 61374194), the National Key Science and Technology Pillar Program of China (No. 2014BAG01B03), and the Key Research and Development Program of Jiangsu Province (No. BE2016739) for funding. In addition, we would like to thank the Public Security Department of Jiangsu Province for providing the PSD-HIGHROAD database.
Cite this article
Zhang, Y., Lv, P., Lu, X. et al. Face detection and alignment method for driver on highroad based on improved multi-task cascaded convolutional networks. Multimed Tools Appl 78, 26661–26679 (2019). https://doi.org/10.1007/s11042-019-07836-2