DOI: 10.5555/3666122.3668838 · Research article

EV-Eye: rethinking high-frequency eye tracking through the lenses of event cameras

Published: 30 May 2024

Abstract

In this paper, we present EV-Eye, a first-of-its-kind large-scale multimodal eye tracking dataset aimed at inspiring research on high-frequency eye/gaze tracking. EV-Eye utilizes the emerging bio-inspired event camera to capture independent pixel-level intensity changes induced by eye movements, achieving sub-microsecond latency. The dataset was curated over two weeks from 48 participants spanning diverse genders and age groups. It comprises over 1.5 million near-eye grayscale images and 2.7 billion event samples generated by two DAVIS346 event cameras. Additionally, it contains 675 thousand scene images and 2.7 million gaze references captured by a Tobii Pro Glasses 3 eye tracker for cross-modality validation. Compared with existing event-based high-frequency eye tracking datasets, ours is significantly larger, and its gaze references cover more natural and diverse eye movement patterns, i.e., fixation, saccade, and smooth pursuit. Alongside the event data, we present a hybrid eye tracking method as a benchmark, which leverages both the near-eye grayscale images and the event data for robust, high-frequency eye tracking. We show that our method achieves higher accuracy on both pupil and gaze estimation tasks than the existing solution.
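
The abstract does not detail how the benchmark consumes the raw event stream, but a common pre-processing step for event-camera pipelines like this is to bin the asynchronous events into fixed-duration frames. The sketch below (not the authors' code) illustrates this for a DAVIS346 sensor, whose 346x260-pixel array emits events as tuples of pixel coordinates, a microsecond timestamp, and a polarity. The function name, argument names, and the 1 ms window are illustrative assumptions, not details taken from the paper.

    import numpy as np

    WIDTH, HEIGHT = 346, 260  # DAVIS346 spatial resolution

    def events_to_frames(xs, ys, ts, ps, window_us=1000):
        """Bin an event stream into signed per-pixel count frames.

        xs, ys : pixel coordinates of each event
        ts     : event timestamps in microseconds, sorted ascending
        ps     : polarities in {+1, -1} (ON / OFF intensity changes)
        window_us : duration of each frame window (1 ms here, assumed)
        """
        frames = []
        start = ts[0]
        frame = np.zeros((HEIGHT, WIDTH), dtype=np.int16)
        for x, y, t, p in zip(xs, ys, ts, ps):
            # Close windows (possibly empty ones) until t falls inside.
            while t - start >= window_us:
                frames.append(frame)
                frame = np.zeros((HEIGHT, WIDTH), dtype=np.int16)
                start += window_us
            frame[y, x] += p  # accumulate signed event counts
        frames.append(frame)
        return frames

At a 1 ms window this yields a nominal 1 kHz frame rate, the kind of regime event-based eye trackers target and well beyond the typical 30-100 Hz of conventional frame-based trackers; a hybrid method can then combine such event frames with the lower-rate grayscale images.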


Published In

NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems (December 2023, 80772 pages)

Publisher

Curran Associates Inc., Red Hook, NY, United States
