DOI: 10.1145/3574131.3574457
research-article
Open access

6-DoF Pose Relocalization for Event Cameras With Entropy Frame and Attention Networks

Published: 13 January 2023

Abstract

Camera relocalization is an important task in computer vision, with applications in VR, AR, and robotics. It estimates the 6-DoF camera pose of an input image in a known scene. A large body of research addresses relocalization with standard cameras. However, standard cameras suffer from high power consumption, low frame rates, and poor robustness. Event cameras can compensate for these disadvantages. Unlike RGB data, event data is an asynchronous stream, and most event-processing methods first convert it into event images. These methods, however, cannot efficiently generate event images with clear edges at an arbitrary time. We propose a Reversed Window Entropy Image (RWEI) generation framework for events, which can generate event images with clear edges at any time. We also propose an Attention-guided Event Camera Relocalization Network (AECRN) that exploits the characteristics of event images to estimate the pose of the event camera more accurately. We evaluate the proposed framework and network on public dataset sequences, and experiments show that our method surpasses the previous method.
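
The abstract does not spell out how RWEI is computed, so the following is only a generic illustration of the two ingredients it names: converting an asynchronous event stream into an image at a chosen time, and measuring the Shannon entropy of that image. It is a minimal sketch, not the authors' method. The event layout `(x, y, t, polarity)`, the function names `accumulate_events` and `shannon_entropy`, and the `max_events` cutoff are all assumptions made for the example.

```python
import math
from collections import Counter


def accumulate_events(events, width, height, t_query, max_events):
    """Build a signed count image from the latest events at or before t_query.

    Each event is an assumed (x, y, t, polarity) tuple. Events are visited
    backwards in time from t_query, and at most max_events are used.
    """
    image = [[0] * width for _ in range(height)]
    recent = [e for e in events if e[2] <= t_query]
    recent.sort(key=lambda e: e[2], reverse=True)  # newest event first
    for x, y, _t, polarity in recent[:max_events]:
        image[y][x] += 1 if polarity > 0 else -1
    return image


def shannon_entropy(image):
    """Shannon entropy in bits of the histogram of pixel values."""
    values = [v for row in image for v in row]
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Four hypothetical events on a 2 x 2 sensor.
    events = [(0, 0, 0.1, 1), (1, 0, 0.2, -1), (1, 1, 0.3, 1), (0, 0, 0.4, 1)]
    frame = accumulate_events(events, width=2, height=2, t_query=0.35, max_events=10)
    print(frame, shannon_entropy(frame))
```

In a sketch like this, an entropy score could in principle guide how many events to accumulate before the image is considered informative enough; whether RWEI uses entropy in this way is not stated in the abstract.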

    Published In

    VRCAI '22: Proceedings of the 18th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and its Applications in Industry
    December 2022
    284 pages
    ISBN: 9798400700316
    DOI: 10.1145/3574131
    Editors: Enhua Wu, Lionel Ming-Shuan Ni, Zhigeng Pan, Daniel Thalmann, Ping Li, Charlie C.L. Wang, Lei Zhu, Minghao Yang
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. camera relocalization
    2. entropy image
    3. event camera
    4. event image

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    VRCAI '22

    Acceptance Rates

    Overall Acceptance Rate 51 of 107 submissions, 48%
