Abstract
The advantages of the Dynamic Vision Sensor (DVS) camera and Spiking Neural Networks (SNNs) have attracted much attention in computer vision. However, like many deep learning models, SNNs suffer from overfitting. The problem is especially severe on DVS datasets, which are usually much smaller than traditional frame-based datasets. This paper proposes a data augmentation method for event cameras that augments asynchronous events through random translation and time scaling. The proposed method effectively improves the diversity of event datasets and thus enhances the generalization ability of models. We use a Liquid State Machine (LSM) model to evaluate our method on two DVS datasets recorded in real scenes, the DVS128 Gesture Dataset and the SL-Animals-DVS Dataset. Experimental results show that the proposed method improves accuracy over the no-augmentation baseline by 3.99% and 7.3%, respectively.
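The two operations named in the abstract, random translation and time scaling, can be sketched on raw event streams as follows. This is a minimal illustration under the common assumption that events are stored as (x, y, t, polarity) rows in a NumPy array; the sensor resolution, shift limit, and scale range used here are illustrative parameters, not values taken from the paper.

```python
import numpy as np

def augment_events(events, width=128, height=128,
                   max_shift=10, scale_range=(0.8, 1.2), rng=None):
    """Randomly translate events in space and scale them in time.

    events: (N, 4) array of (x, y, t, polarity) rows.
    All parameter defaults are illustrative, not the paper's settings.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = events.astype(np.float64).copy()

    # One random spatial shift, applied identically to every event.
    dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
    out[:, 0] += dx
    out[:, 1] += dy

    # Discard events shifted outside the sensor array.
    keep = ((out[:, 0] >= 0) & (out[:, 0] < width) &
            (out[:, 1] >= 0) & (out[:, 1] < height))
    out = out[keep]

    # One random temporal scale factor stretches or compresses
    # all timestamps relative to t = 0.
    alpha = rng.uniform(*scale_range)
    out[:, 2] *= alpha
    return out
```

Because both operations act on the asynchronous event tuples directly, the augmented stream can be fed to any event-based pipeline (frames, voxel grids, or spike trains for an SNN/LSM) without changing the downstream encoding.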
Acknowledgements
This work is supported by the National Natural Science Foundation of China (61902408).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Xiao, X., Chen, X., Kang, Z., Guo, S., Wang, L. (2023). A Spatio-Temporal Event Data Augmentation Method for Dynamic Vision Sensor. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Communications in Computer and Information Science, vol 1793. Springer, Singapore. https://doi.org/10.1007/978-981-99-1645-0_35
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1644-3
Online ISBN: 978-981-99-1645-0
eBook Packages: Computer Science (R0)