Abstract
Most e-learning environments that use user feedback or profiles collect this information through questionnaires, which very often yield incomplete answers and sometimes deliberately misleading input. In this work, we present a mechanism that compiles feedback on the behavioral state of the user (e.g. level of interest) while the user reads an electronic document. The mechanism is non-intrusive: it uses a simple web camera to detect and track head, eye and hand movements, and it estimates the level of interest and engagement with a neuro-fuzzy network that is initialized with evidence from Theory of Mind and trained on expert-annotated data. The user does not need to interact with the proposed system and can act as if she were not monitored at all. The proposed scheme is tested in an e-learning environment, where it is used to adapt the presentation of the content to the user profile and current behavioral state. Experiments in a testbed that tracks children’s reading performance show that the proposed system detects reading- and attention-related user states very effectively.
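To make the idea of a fuzzy estimate of interest more concrete, the following is a minimal sketch of zero-order Takagi-Sugeno inference over two visual cues. It is an illustration only, not the network described in the paper: the input names (`head_dev`, `gaze_dev`, both assumed to be normalized to [0, 1]), the four rules, the Gaussian widths and the rule outputs are all hypothetical, and the training step on expert-annotated data is omitted.

```python
import math

def gauss(x, center, sigma):
    """Gaussian membership value of x for a fuzzy set centred at `center`."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

# Hypothetical rule base (assumption, not taken from the paper).
# Each rule: (head-deviation centre, gaze-deviation centre, sigma, interest output).
# Deviations are normalized: 0 = facing/looking at the screen, 1 = fully away.
RULES = [
    (0.0, 0.0, 0.3, 1.0),  # head and gaze on screen  -> high interest
    (0.0, 1.0, 0.3, 0.5),  # head on screen, gaze away -> medium interest
    (1.0, 0.0, 0.3, 0.4),  # head away, gaze on screen -> medium-low interest
    (1.0, 1.0, 0.3, 0.0),  # head and gaze away        -> no interest
]

def estimate_interest(head_dev, gaze_dev):
    """Return an interest score in [0, 1] as the firing-strength-weighted
    average of the constant rule outputs (zero-order Takagi-Sugeno)."""
    # Clamp inputs to the assumed normalized range.
    head_dev = min(max(head_dev, 0.0), 1.0)
    gaze_dev = min(max(gaze_dev, 0.0), 1.0)
    # Firing strength of each rule: product of its two memberships.
    weights = [gauss(head_dev, hc, s) * gauss(gaze_dev, gc, s)
               for hc, gc, s, _ in RULES]
    total = sum(weights)
    return sum(w * out for w, (_, _, _, out) in zip(weights, RULES)) / total

if __name__ == "__main__":
    print(estimate_interest(0.0, 0.0))  # attentive reader
    print(estimate_interest(1.0, 1.0))  # reader looking away
```

In a neuro-fuzzy setting, the membership centres, widths and rule outputs would be tuned from annotated examples rather than fixed by hand as they are here.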
Acknowledgement
This work has been funded by the FP6 IP Callas (Conveying Affectiveness in Leading-edge Living Adaptive Systems), Contract Number IST-34800, and the FP6 STREP Agent-Dysl (Accommodative Intelligent Educational Environments for Dyslexic learners), Contract Number IST-034549.
Cite this article
Asteriadis, S., Tzouveli, P., Karpouzis, K. et al. Estimation of behavioral user state based on eye gaze and head pose—application in an e-learning environment. Multimed Tools Appl 41, 469–493 (2009). https://doi.org/10.1007/s11042-008-0240-1
DOI: https://doi.org/10.1007/s11042-008-0240-1