research-article

3D pedestrian tracking and frontal face image capture based on head point detection

Authors:

Zhongchuan Zhang,

Fernand Cohen

Multimedia Tools and Applications, Volume 79, Issue 1-2

Pages 737 - 764

https://doi.org/10.1007/s11042-019-08121-y

Published: 01 January 2020

Abstract

This paper proposes a method to track pedestrians in crowded scenes and to capture close-up frontal face images of a person of interest (POI) for recognition. Pedestrians are tracked via the 3D positions of their head points (the highest point of a person) using two static overhead cameras. Head points are located and tracked based on geometric and color cues in the scene. Possible head areas in a frame acquired from one of the overhead cameras are determined using projective geometry, and head areas belonging to the same person are clustered. Without creating a full disparity map of the scene, the 3D position of a pedestrian is obtained from the disparity along the line segment that passes through his/her head top. The 3D head position is then tracked using common assumptions on motion velocity. If this tracking is not accurate enough, the color distribution of the head top is integrated as a complementary cue. With the 3D head point information, a set of pan-tilt-zoom (PTZ) cameras is scheduled to capture frontal face images of the POI. The most suitable PTZ camera is selected by evaluating the capture quality of each PTZ camera and its current state. The approach is tested using a publicly available visual surveillance simulation test bed. The experiments show that the 3D tracking errors are around 4 cm and that high-quality frontal face images are captured.
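
The geometric and color cues named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a rectified stereo pair with a pinhole model (so depth follows the standard relation Z = f·B/d), a constant-velocity motion model as one possible reading of "common assumptions on motion velocity", and the Bhattacharyya coefficient as one common way to compare color histograms. All function names, parameters, and numeric values below are hypothetical.

```python
import math


def triangulate_head_point(u_left, u_right, v, focal_px, baseline_m, cx, cy):
    """Recover a 3D point (camera frame, metres) from one stereo match.

    Assumes rectified cameras: the match lies on the same image row v,
    and disparity is the horizontal pixel offset between the two views.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    depth = focal_px * baseline_m / disparity  # Z = f * B / d
    x = (u_left - cx) * depth / focal_px       # back-project pixel to metric X
    y = (v - cy) * depth / focal_px            # back-project pixel to metric Y
    return (x, y, depth)


def predict_next(prev_pos, curr_pos):
    """Constant-velocity prediction: next = current + (current - previous)."""
    return tuple(2 * c - p for p, c in zip(prev_pos, curr_pos))


def bhattacharyya_coefficient(hist_a, hist_b):
    """Similarity in [0, 1] between two color histograms (1 means identical shape)."""
    total_a, total_b = sum(hist_a), sum(hist_b)
    return sum(
        math.sqrt((a / total_a) * (b / total_b)) for a, b in zip(hist_a, hist_b)
    )


if __name__ == "__main__":
    # Hypothetical calibration: 800 px focal length, 0.5 m baseline.
    head = triangulate_head_point(420.0, 380.0, 240.0, 800.0, 0.5, 320.0, 240.0)
    print("head point:", head)
    print("predicted:", predict_next((0.0, 0.0, 10.0), (0.1, 0.0, 10.0)))
    print("color similarity:", bhattacharyya_coefficient([4, 3, 1], [4, 2, 2]))
```

In a tracker built along these lines, the predicted position would gate which detected head point continues a track, and the histogram similarity would only be consulted when the geometric match is ambiguous.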

Published In

Multimedia Tools and Applications Volume 79, Issue 1-2

Jan 2020

1595 pages

ISSN:1380-7501

© Springer Science+Business Media, LLC, part of Springer Nature 2019.

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 01 January 2020

Accepted: 13 August 2019

Revision received: 07 April 2019

Received: 07 May 2018
