Video-Based People Tracking

Marcus A. Brubaker,
Leonid Sigal &
David J. Fleet

Abstract

Vision-based human pose tracking promises to be a key enabling technology for myriad applications, including the analysis of human activities for perceptive environments and novel man-machine interfaces. While progress toward that goal has been exciting, and limited applications have been demonstrated, the recovery of human pose from video in unconstrained settings remains challenging. One of the key challenges stems from the complexity of the human kinematic structure itself. The sheer number and variety of joints in the human body (the nature of which is an active area of biomechanics research) entails the estimation of many parameters. The estimation problem is also challenging because muscles and other body tissues obscure the skeletal structure, making it impossible to directly observe the pose of the skeleton.

Author information

Authors and Affiliations

Department of Computer Science, University of Toronto, Toronto, Canada
Marcus A. Brubaker, Leonid Sigal & David J. Fleet

Corresponding authors

Correspondence to Marcus A. Brubaker, Leonid Sigal or David J. Fleet.

Editor information

Editors and Affiliations

Future University Hakodate, Kameda-Nakano 116-2, Hakodate, Hokkaido, 041-8655, Japan
Hideyuki Nakashima
Department of Electrical Engineering, Stanford University, 350 Serra Mall, Stanford, CA, 94305-9515, USA
Hamid Aghajan
School of Computing & Mathematics, University of Ulster at Jordanstown, Shore Road, Newtownabbey, Co. Antrim, UK, BT37 0QB
Juan Carlos Augusto

About this chapter

Cite this chapter

Brubaker, M.A., Sigal, L., Fleet, D.J. (2010). Video-Based People Tracking. In: Nakashima, H., Aghajan, H., Augusto, J.C. (eds) Handbook of Ambient Intelligence and Smart Environments. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-93808-0_3

DOI: https://doi.org/10.1007/978-0-387-93808-0_3
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-93807-3
Online ISBN: 978-0-387-93808-0
eBook Packages: Computer Science, Computer Science (R0)
