Abstract
Human arm pose plays an important role in understanding human behaviours. It directly conveys information about a person's identity, action style, manner of interaction, personal habits, etc. However, the high dynamics of the arm parts, especially the movement of the forearms and hands, make it challenging to model them with high accuracy. To overcome this problem in a specific application, such as modeling the arm pose of pedestrians, this paper adopts multiple priors to reduce the uncertainty of the arm parts. First, human structure information, i.e. the prior on the size of the arm parts, is adopted to remove impossible arm configurations. Second, the prior on the arm-part configuration of a specific action is used to constrain the co-occurrence relations of all arm components. A Bayesian approach is therefore presented for modeling arm pose that incorporates these multiple priors and the likelihoods from visual observation. This paper proposes an arm model whose priors can be represented easily. It also describes how the priors are estimated from the CMU motion dataset by kernel density estimation, and how the parameters of the arm parts are estimated by maximum a posteriori. Since priors for walking style are available, the method can be used directly for modeling the arm pose of pedestrians without pre-training. It is found to perform effectively on an HKU campus testing dataset, and has also been evaluated on different human sizes and lighting conditions.
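As a rough illustration of the idea summarised above, the following sketch combines a kernel density estimate of a motion prior with a visual likelihood and picks the maximum a posteriori value. It is not the authors' implementation: the single elbow-angle parameter, the sample angles, the Gaussian observation model, the 0 to 150 degree feasibility range, and all function names are assumptions made only for this example.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function evaluating a 1-D Gaussian kernel density estimate."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )
    return density

def log_likelihood(angle, observed, noise_std):
    """Unnormalised log of a Gaussian observation model (an assumption)."""
    return -0.5 * ((angle - observed) / noise_std) ** 2

def map_estimate(candidates, prior, observed, noise_std, feasible):
    """Grid-search MAP: maximise log prior + log likelihood over feasible candidates."""
    best, best_score = None, -math.inf
    for a in candidates:
        if not feasible(a):      # structural prior: drop impossible configurations
            continue
        p = prior(a)
        if p <= 0.0:             # skip points where the density underflows to zero
            continue
        score = math.log(p) + log_likelihood(a, observed, noise_std)
        if score > best_score:
            best, best_score = a, score
    return best

# Made-up elbow flexion angles (degrees) standing in for motion-capture samples.
walking_angles = [18.0, 20.0, 22.0, 25.0, 27.0, 30.0, 32.0, 35.0]
prior = gaussian_kde(walking_angles, bandwidth=5.0)

# Hypothetical hard constraint playing the role of the structural prior.
feasible = lambda a: 0.0 <= a <= 150.0

# Candidate angles on a 1-degree grid, deliberately wider than the feasible range.
candidates = [float(a) for a in range(-20, 181)]

# A noisy visual measurement of 60 degrees is pulled towards the walking prior.
estimate = map_estimate(candidates, prior, observed=60.0, noise_std=15.0,
                        feasible=feasible)
print(estimate)
```

With a noisy observation, the estimate lands between the bulk of the prior samples and the measured value, which is the behaviour one would expect from a prior that reduces uncertainty in the observation. A real arm model would have several coupled parameters and would need a multivariate density and a better optimiser than a grid search.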
Cite this article
Li, C., Yung, N.H.C. Arm Poses Modeling for Pedestrians with Motion Prior. J Sign Process Syst 84, 237–249 (2016). https://doi.org/10.1007/s11265-015-1049-6
DOI: https://doi.org/10.1007/s11265-015-1049-6