Enhancing face detection in video sequences by video segmentation preprocessing

Huibin Liu¹,
Zuoxun Fan ORCID: orcid.org/0000-0002-7444-9471²,
Qiang Chen¹ &
Xiaomei Zhang¹

Abstract

In recent years, several learning-based methods have been proposed to detect and locate humans in real time using convolutional neural networks (CNNs). However, these methods require high-performance graphics processing units (GPUs). To address this problem, a preprocessing procedure based on video segmentation is proposed to speed up face detection. In addition, an acceleration toolkit is employed to perform face detection in real time on a standard central processing unit (CPU). Experimental results indicate that the proposed method achieves an F1-score of 93.2% and runs at 4.5 times real-time speed on a single CPU over 155,883 test frames from the RAI dataset, YouTube, and YOUKU. Notably, when a video sequence contains few frames with human faces, the highest speed is nearly 18 times faster than without video segmentation.
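The abstract does not spell out the segmentation procedure, so the following is only a minimal sketch of how shot-level preprocessing could reduce the number of detector calls. It rests on several assumptions that are not taken from the article: frames are flattened grayscale pixel lists, shot boundaries are found with a simple histogram difference, and a shot is skipped entirely when its first frame contains no face. The names `split_into_shots`, `detect_with_preprocessing`, and the `has_face` callback are hypothetical stand-ins for a real segmentation step and a real CNN detector.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: a frame is a flattened grayscale image with values in 0..255.
Frame = Sequence[int]


def histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Return the normalised intensity histogram of one frame."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    total = float(len(frame)) or 1.0
    return [c / total for c in counts]


def split_into_shots(frames: List[Frame], threshold: float = 0.5) -> List[range]:
    """Cut the video wherever consecutive histograms differ by more than `threshold`.

    The L1 distance between two normalised histograms lies in [0, 2].
    This is an illustrative criterion, not the one used in the article.
    """
    if not frames:
        return []
    shots: List[range] = []
    start = 0
    prev = histogram(frames[0])
    for i in range(1, len(frames)):
        cur = histogram(frames[i])
        if sum(abs(a - b) for a, b in zip(prev, cur)) > threshold:
            shots.append(range(start, i))
            start = i
        prev = cur
    shots.append(range(start, len(frames)))
    return shots


def detect_with_preprocessing(
    frames: List[Frame],
    has_face: Callable[[Frame], bool],
    threshold: float = 0.5,
) -> Tuple[List[bool], int]:
    """Run the detector on each shot's first frame; skip the shot if it has no face.

    Returns per-frame face flags and the number of detector calls made.
    """
    flags = [False] * len(frames)
    calls = 0
    for shot in split_into_shots(frames, threshold):
        calls += 1
        if not has_face(frames[shot.start]):
            continue  # assumed heuristic: no face in the key frame, skip the shot
        flags[shot.start] = True
        for i in range(shot.start + 1, shot.stop):
            calls += 1
            flags[i] = has_face(frames[i])
    return flags, calls


if __name__ == "__main__":
    # Toy video: five dark frames (no face) followed by five bright frames (face).
    video = [[0] * 64] * 5 + [[255] * 64] * 5
    toy_detector = lambda frame: frame[0] > 128  # stand-in for a CNN detector
    flags, calls = detect_with_preprocessing(video, toy_detector)
    print(flags, calls)
```

Under these assumptions the saving grows with the share of face-free shots, which is in keeping with the abstract's remark that the largest speed-up occurs when few frames contain faces. Skipping on a single key frame can miss faces that enter mid-shot, so a real system would need a more careful rule.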

Author information

Authors and Affiliations

School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Longteng, Shanghai, 201620, China
Huibin Liu, Qiang Chen & Xiaomei Zhang
Infrastructure Construction Department, Shanghai University of Engineering Science, Longteng, Shanghai, 201620, China
Zuoxun Fan

Corresponding author

Correspondence to Zuoxun Fan.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, H., Fan, Z., Chen, Q. et al. Enhancing face detection in video sequences by video segmentation preprocessing. Appl Intell 53, 2897–2907 (2023). https://doi.org/10.1007/s10489-022-03608-y

Accepted: 09 April 2022
Published: 14 May 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s10489-022-03608-y
