Abstract
Gaze following plays a crucial role in scene comprehension tasks: it predicts where a person is looking from visual cues in their face and eye movements. The technique is applied in domains such as human-computer interaction and medical diagnosis. For multi-party meeting scenes, some studies have used fisheye cameras to capture the entire meeting. In this work, we focus on gaze following in fisheye images of meeting scenes and collect the GazeMeeting dataset, which contains 31,915 fisheye samples. We also propose a dual-path feature fusion model for gaze following, which introduces spherical convolutions and fuses the features learned in the planar and spherical domains. The dual-path model can learn position-dependent distortion information from scene images and achieves a normalized L2 distance of 0.0657 on our self-built GazeMeeting dataset, a 22.80% improvement over the current state-of-the-art methods. Our model also achieves a normalized L2 distance of 0.1326 on the GazeFollow dataset, outperforming the current state-of-the-art methods by 3.35%.
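The normalized L2 distance reported in the abstract is commonly computed as the Euclidean distance between the predicted and annotated gaze points after pixel coordinates are scaled to the range [0, 1] by the image width and height. The following is a minimal sketch under that assumption; the abstract does not spell out the paper's evaluation protocol, and the function names `normalized_l2` and `mean_normalized_l2` are hypothetical, chosen here for illustration only.

```python
import math


def normalized_l2(pred_px, target_px, width, height):
    """Euclidean distance between two points after scaling x by the image
    width and y by the image height (assumed normalization convention)."""
    dx = (pred_px[0] - target_px[0]) / width
    dy = (pred_px[1] - target_px[1]) / height
    return math.hypot(dx, dy)


def mean_normalized_l2(preds, targets, sizes):
    """Average the per-sample normalized L2 distance over a dataset.

    preds, targets: sequences of (x, y) pixel coordinates.
    sizes: sequence of (width, height) pairs, one per image.
    """
    dists = [
        normalized_l2(p, t, w, h)
        for p, t, (w, h) in zip(preds, targets, sizes)
    ]
    return sum(dists) / len(dists)
```

With this convention a lower value is better, and the score does not depend on image resolution, which makes results on datasets with different image sizes easier to compare.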
This work was supported by the National Key Research and Development Program of China under Grant 2021YFC3340803.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Rao, L., Huang, X., Cai, S., Tian, B., Xu, W., Cheng, W. (2024). A Dual-Path Approach for Gaze Following in Fisheye Meeting Scenes. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14429. Springer, Singapore. https://doi.org/10.1007/978-981-99-8469-5_16
DOI: https://doi.org/10.1007/978-981-99-8469-5_16
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8468-8
Online ISBN: 978-981-99-8469-5
eBook Packages: Computer Science, Computer Science (R0)