Abstract

Vision-based human pose tracking promises to be a key enabling technology for myriad applications, including the analysis of human activities for perceptive environments and novel man-machine interfaces. While progress toward that goal has been exciting, and limited applications have been demonstrated, the recovery of human pose from video in unconstrained settings remains challenging. One of the key challenges stems from the complexity of the human kinematic structure itself. The sheer number and variety of joints in the human body (the nature of which is an active area of biomechanics research) entails the estimation of many parameters. The estimation problem is also challenging because muscles and other body tissues obscure the skeletal structure, making it impossible to directly observe the pose of the skeleton.



Author information

Corresponding authors

Correspondence to Marcus A. Brubaker, Leonid Sigal or David J. Fleet.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Brubaker, M.A., Sigal, L., Fleet, D.J. (2010). Video-Based People Tracking. In: Nakashima, H., Aghajan, H., Augusto, J.C. (eds) Handbook of Ambient Intelligence and Smart Environments. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-93808-0_3

  • DOI: https://doi.org/10.1007/978-0-387-93808-0_3

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-93807-3

  • Online ISBN: 978-0-387-93808-0

  • eBook Packages: Computer Science, Computer Science (R0)
