A Study on Visual Focus of Attention Recognition from Head Pose in a Meeting Room

  • Conference paper
Machine Learning for Multimodal Interaction (MLMI 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4299)

Abstract

This paper presents a study on the recognition of the visual focus of attention (VFOA) of meeting participants based on their head pose. Contrary to previous studies on the topic, in our set-up the potential VFOA of a person is not restricted to the other meeting participants, but also includes environmental targets (table, slide screen). This has two consequences. Firstly, it increases the number of possible ambiguities in identifying the VFOA from the head pose. Secondly, due to our particular set-up, the identification of the VFOA from head pose cannot rely on an incomplete representation of the pose (the pan alone), but requires knowledge of the full head pointing information (pan and tilt). In this paper, using a corpus of 8 meetings of 8 minutes on average, featuring 4 persons involved in the discussion of statements projected on a slide screen, we analyze these issues by evaluating, through numerical performance measures, the recognition of the VFOA from head pose information obtained either from a magnetic sensor device (the ground truth) or from a vision-based tracking system (head pose estimates). The results clearly show that in complex but realistic situations, it is quite optimistic to believe that the recognition of the VFOA can be based solely on the head pose, as some previous studies had suggested.
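To make the pan-versus-tilt point concrete, here is a minimal Python sketch of assigning a head pose to a VFOA target by Gaussian likelihood. It is an illustration only and not the recognition model evaluated in the paper: the target names, their (pan, tilt) directions, the spread `SIGMA`, and the helper functions are all hypothetical assumptions chosen so that two targets share the same pan angle.

```python
# Hypothetical target directions (pan, tilt) in degrees, as seen by one
# participant. Illustrative assumptions only, not values from the paper.
TARGETS = {
    "person_left": (-45.0, 0.0),
    "person_right": (45.0, 0.0),
    "slide_screen": (0.0, 10.0),
    "table": (0.0, -40.0),
}

SIGMA = 15.0  # assumed isotropic spread of head pose around a target, in degrees


def log_likelihood(pose, mean, sigma=SIGMA, use_tilt=True):
    """Unnormalized Gaussian log-likelihood of a (pan, tilt) pose for one target."""
    d_pan = pose[0] - mean[0]
    d_tilt = pose[1] - mean[1] if use_tilt else 0.0  # drop tilt for the pan-only case
    return -(d_pan ** 2 + d_tilt ** 2) / (2.0 * sigma ** 2)


def best_targets(pose, use_tilt=True, tol=1e-9):
    """Return every target whose score ties for the maximum (sorted by name)."""
    scores = {
        name: log_likelihood(pose, mean, use_tilt=use_tilt)
        for name, mean in TARGETS.items()
    }
    top = max(scores.values())
    return sorted(name for name, score in scores.items() if top - score <= tol)


def recognize_vfoa(pose, use_tilt=True):
    """Pick a single target; ties are broken alphabetically."""
    return best_targets(pose, use_tilt=use_tilt)[0]


if __name__ == "__main__":
    looking_down = (2.0, -35.0)  # head turned slightly right and tilted down
    print(best_targets(looking_down, use_tilt=True))   # pan and tilt
    print(best_targets(looking_down, use_tilt=False))  # pan only
```

With these assumed directions, the pose `(2.0, -35.0)` is assigned to the table when both angles are used, whereas the pan-only variant cannot separate the table from the slide screen because both lie at the same pan angle. Real set-ups would also need per-person target geometry and a model of how head pose relates to gaze, neither of which this sketch attempts.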

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ba, S.O., Odobez, J.-M. (2006). A Study on Visual Focus of Attention Recognition from Head Pose in a Meeting Room. In: Renals, S., Bengio, S., Fiscus, J.G. (eds) Machine Learning for Multimodal Interaction. MLMI 2006. Lecture Notes in Computer Science, vol 4299. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11965152_7

  • DOI: https://doi.org/10.1007/11965152_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69267-6

  • Online ISBN: 978-3-540-69268-3

  • eBook Packages: Computer Science (R0)
