research-article
Open access

Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems

Published: 17 May 2024

Abstract

Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate. Although segmentation models trained using supervised machine learning can excel at this task, their effectiveness is determined by the degree of overlap between the narrow distributions of image properties defined by the target dataset and highly specific training datasets, of which there are few. Attempts to broaden the distribution of existing eye image datasets through the inclusion of synthetic eye images have found that a model trained on synthetic images will often fail to generalize back to real-world eye images. As a remedy, we use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data, and to prune the training dataset in a manner that maximizes distribution overlap. We demonstrate that our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
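The abstract does not say which dimensionality-reduction technique or pruning rule the authors use, so the following is only a minimal sketch of the general idea, under stated assumptions: PCA stands in for the embedding, nearest-neighbor distance to the real samples stands in for the overlap measure, and the feature vectors are random toy data. The names `pca_project`, `prune_synthetic`, and `keep_fraction` are hypothetical and do not come from the paper.

```python
import numpy as np


def pca_project(real, synth, n_components=2):
    """Fit PCA on the pooled features and project both sets (assumed embedding)."""
    pooled = np.vstack([real, synth])
    mean = pooled.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(pooled - mean, full_matrices=False)
    basis = vt[:n_components].T
    return (real - mean) @ basis, (synth - mean) @ basis


def prune_synthetic(real_z, synth_z, keep_fraction=0.5):
    """Keep the synthetic samples that lie closest to any real sample."""
    # Pairwise Euclidean distances, shape (n_synth, n_real).
    dists = np.linalg.norm(synth_z[:, None, :] - real_z[None, :, :], axis=2)
    nearest = dists.min(axis=1)
    n_keep = max(1, int(round(keep_fraction * len(synth_z))))
    keep_idx = np.sort(np.argsort(nearest)[:n_keep])
    return keep_idx, nearest


# Toy data (an assumption, not eye images): 16-D feature vectors.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(200, 16))          # target (real) domain
synth_near = rng.normal(0.0, 1.0, size=(100, 16))    # synthetic, overlaps target
synth_far = rng.normal(6.0, 1.0, size=(100, 16))     # synthetic, off-distribution
synth = np.vstack([synth_near, synth_far])

real_z, synth_z = pca_project(real, synth)
keep_idx, nearest = prune_synthetic(real_z, synth_z, keep_fraction=0.5)

gap_before = nearest.mean()            # mean distance to nearest real sample
gap_after = nearest[keep_idx].mean()   # same measure on the pruned set
print(f"kept {len(keep_idx)} of {len(synth)} synthetic samples")
print(f"mean nearest-real distance: {gap_before:.2f} -> {gap_after:.2f}")
```

In this toy setup the pruned set consists of the synthetic samples drawn from the same distribution as the real ones, and the mean distance to the nearest real sample drops accordingly. A practical pipeline would presumably work on image-derived features rather than random vectors, and the choice of embedding and threshold would need to be validated against segmentation performance.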



Information & Contributors

Information

Published In

Proceedings of the ACM on Computer Graphics and Interactive Techniques  Volume 7, Issue 2
May 2024
101 pages
EISSN:2577-6193
DOI:10.1145/3665652
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Badges

  • Best Paper

Author Tags

  1. Deep learning
  2. Domain adaptation
  3. Eye segmentation
  4. Eye-tracking
  5. Generative modeling

Qualifiers

  • Research-article
  • Research
  • Refereed


