Abstract
We describe the recent development of assistive computer vision algorithms for use with the Argus II retinal prosthesis system. While users of the prosthetic system can learn and adapt to the limited stimulation resolution, there exists great potential for computer vision algorithms to augment the experience and significantly increase the utility of the system for the user. To this end, our recent work has focused on helping with two different challenges encountered by the visually impaired: face detection and object recognition. In this paper, we describe algorithm implementations in both of these areas that make use of the retinal prosthesis for visual feedback to the user, and discuss the unique challenges faced in this domain.
Notes
1. The speech synthesis and recognition modules run asynchronously from the vision algorithm, and their computational demands are minimal compared to the CNN detection/tracking portion.
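As an illustration of the asynchronous arrangement described in this note, the following is a minimal sketch and not the authors' implementation. It assumes a worker thread fed by a bounded queue; the names `speech_worker`, `run_vision_loop`, and `demo` are hypothetical, the detector is a stand-in function, and the speech call is replaced by appending to a list.

```python
import queue
import threading


def speech_worker(messages, spoken):
    """Consume announcements until a None sentinel arrives."""
    while True:
        text = messages.get()
        if text is None:
            break
        # Placeholder for a real text-to-speech call (assumption):
        # here the text is only recorded.
        spoken.append(text)


def run_vision_loop(frames, detect, messages):
    """Run the (stand-in) detector on each frame without waiting on speech."""
    for frame in frames:
        label = detect(frame)
        if label is None:
            continue
        try:
            # Non-blocking put: the vision loop never stalls on speech output.
            messages.put_nowait(label)
        except queue.Full:
            # Drop the announcement rather than delay the next frame.
            pass


def demo():
    messages = queue.Queue(maxsize=8)
    spoken = []
    worker = threading.Thread(
        target=speech_worker, args=(messages, spoken), daemon=True
    )
    worker.start()

    # Hypothetical frames: each "frame" is just its expected label or None.
    frames = ["face", None, "cup"]
    run_vision_loop(frames, lambda frame: frame, messages)

    messages.put(None)  # sentinel: ask the worker to finish
    worker.join()
    return spoken
```

The bounded queue is one possible design choice: if speech output falls behind, announcements are dropped instead of slowing the detection and tracking loop.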
Acknowledgement
This work was supported by an Alfred E. Mann collaboration grant. We would also like to thank Arup Roy, Avi Caspi, and Robert Greenberg, our collaborators from Second Sight Medical Products.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Rollend, D., Rosendall, P., Billings, S., Burlina, P., Wolfe, K., Katyal, K. (2017). Face Detection and Object Recognition for a Retinal Prosthesis. In: Chen, CS., Lu, J., Ma, KK. (eds) Computer Vision – ACCV 2016 Workshops. ACCV 2016. Lecture Notes in Computer Science, vol 10116. Springer, Cham. https://doi.org/10.1007/978-3-319-54407-6_20
DOI: https://doi.org/10.1007/978-3-319-54407-6_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54406-9
Online ISBN: 978-3-319-54407-6
eBook Packages: Computer Science, Computer Science (R0)