A Spatio-Temporal Event Data Augmentation Method for Dynamic Vision Sensor

  • Conference paper
  • First Online:
Neural Information Processing (ICONIP 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1793)

Abstract

The advantages of the Dynamic Vision Sensor (DVS) camera and Spiking Neural Networks (SNNs) have attracted much attention in the field of computer vision. However, like many deep learning models, SNNs suffer from overfitting. The problem is especially severe on DVS datasets, which are usually much smaller than traditional frame-based datasets. This paper proposes a data augmentation method for event cameras that augments asynchronous events through random translation and time scaling. The proposed method effectively improves the diversity of event datasets and thus enhances the generalization ability of models. We use a Liquid State Machine (LSM) model to evaluate our method on two DVS datasets recorded in real scenes, the DVS128 Gesture Dataset and the SL-Animals-DVS Dataset. The experimental results show that the proposed method improves accuracy over a no-augmentation baseline by 3.99% and 7.3%, respectively.
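
The core idea, applying a random spatial offset and a random time-scale factor to the raw event stream, can be sketched in a few lines. The Python snippet below is a minimal illustration, not the authors' implementation: the (x, y, t, p) event layout, the 128 × 128 sensor size, and the parameters max_shift and scale_range are assumptions chosen for the sketch rather than the paper's exact settings.

```python
import numpy as np

def augment_events(events, max_shift=10, scale_range=(0.8, 1.2),
                   sensor_size=(128, 128), rng=None):
    """Randomly translate events in space and scale them in time.

    `events` is an (N, 4) float array of (x, y, t, p) rows. All
    parameter values here are illustrative assumptions.
    """
    rng = rng or np.random.default_rng()
    aug = events.astype(np.float64, copy=True)

    # Random spatial translation: shift every event by the same offset,
    # so the spatial structure of the gesture is preserved.
    dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
    aug[:, 0] += dx
    aug[:, 1] += dy

    # Discard events pushed outside the sensor array.
    w, h = sensor_size
    keep = ((aug[:, 0] >= 0) & (aug[:, 0] < w) &
            (aug[:, 1] >= 0) & (aug[:, 1] < h))
    aug = aug[keep]

    # Random time scaling: stretch or compress all timestamps by a
    # single factor, varying the apparent speed of the motion.
    alpha = rng.uniform(*scale_range)
    aug[:, 2] *= alpha

    return aug
```

Because one offset and one scale factor are shared by all events in a sample, the augmented stream remains a spatio-temporally coherent recording; only the position and speed of the motion change, which is what makes such samples plausible extra training data for an SNN.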



Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61902408).

Author information

Corresponding author

Correspondence to Lei Wang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Xiao, X., Chen, X., Kang, Z., Guo, S., Wang, L. (2023). A Spatio-Temporal Event Data Augmentation Method for Dynamic Vision Sensor. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Communications in Computer and Information Science, vol 1793. Springer, Singapore. https://doi.org/10.1007/978-981-99-1645-0_35

  • DOI: https://doi.org/10.1007/978-981-99-1645-0_35

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-1644-3

  • Online ISBN: 978-981-99-1645-0

  • eBook Packages: Computer Science, Computer Science (R0)
