Biternion Nets: Continuous Head Pose Regression from Discrete Training Labels

  • Conference paper
Pattern Recognition (DAGM 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9358)

Abstract

While head pose estimation has been studied for some time, continuous head pose estimation is still an open problem. Most approaches either cannot deal with the periodicity of angular data or require very fine-grained regression labels. We introduce biternion nets, a CNN-based approach that can be trained on very coarse regression labels and still estimate fully continuous \(360^{\circ}\) head poses. We show state-of-the-art results on several publicly available datasets. Finally, we demonstrate how easy it is to record and annotate a new dataset with coarse orientation labels in order to obtain continuous head pose estimates using our biternion nets.
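
The periodicity problem mentioned in the abstract can be illustrated with a minimal sketch. Note the assumptions: this excerpt does not define the biternion representation or the training cost, so the (cos, sin) encoding and the cosine cost below are illustrative choices only, and all function names are hypothetical.

```python
import math


def naive_error_deg(pred, target):
    """Plain absolute difference; ignores that 359 and 1 degrees are neighbours."""
    return abs(pred - target)


def angular_error_deg(pred, target):
    """Smallest absolute difference between two angles, in [0, 180] degrees."""
    diff = (pred - target) % 360.0
    return min(diff, 360.0 - diff)


def to_unit_vector(angle_deg):
    """Encode an angle as a 2D unit vector (cos, sin). Illustrative assumption."""
    rad = math.radians(angle_deg)
    return (math.cos(rad), math.sin(rad))


def from_vector(x, y):
    """Decode a non-zero 2D vector back to an angle in degrees, wrapped by 360."""
    return math.degrees(math.atan2(y, x)) % 360.0


def cosine_cost(pred_vec, target_vec):
    """1 - dot product of two unit vectors: 0 when aligned, 2 when opposite."""
    return 1.0 - (pred_vec[0] * target_vec[0] + pred_vec[1] * target_vec[1])


if __name__ == "__main__":
    # The naive difference treats nearly identical poses as far apart.
    print(naive_error_deg(359.0, 1.0))
    print(angular_error_deg(359.0, 1.0))
    # A vector encoding has no discontinuity at the 0/360 boundary.
    print(cosine_cost(to_unit_vector(359.0), to_unit_vector(1.0)))
```

In this sketch a predictor that outputs a 2D vector never has to cross a jump between 359 and 0 degrees, which is one common way to sidestep the wrap-around that a scalar angle output runs into.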

Notes

  1. This becomes evident by computing the derivatives of the cost w.r.t. the parameters: the tilt and roll terms are absent from the derivative w.r.t. the pan and vice versa.

  2. Their setup is justified for their task, but makes a fair comparison impossible.

Author information

Corresponding author

Correspondence to Lucas Beyer.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 1526 KB)

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Beyer, L., Hermans, A., Leibe, B. (2015). Biternion Nets: Continuous Head Pose Regression from Discrete Training Labels. In: Gall, J., Gehler, P., Leibe, B. (eds) Pattern Recognition. DAGM 2015. Lecture Notes in Computer Science, vol 9358. Springer, Cham. https://doi.org/10.1007/978-3-319-24947-6_13

  • DOI: https://doi.org/10.1007/978-3-319-24947-6_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24946-9

  • Online ISBN: 978-3-319-24947-6

  • eBook Packages: Computer Science, Computer Science (R0)
