Open access

6-DoF Pose Relocalization for Event Cameras With Entropy Frame and Attention Networks

Authors:

Xin Yang

VRCAI '22: Proceedings of the 18th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and its Applications in Industry

Article No.: 29, Pages 1 - 8

https://doi.org/10.1145/3574131.3574457

Published: 13 January 2023

Abstract

Camera relocalization is an important task in computer vision, with applications in VR, AR, and robotics. It addresses the problem of estimating the 6-DoF camera pose of an input image in a known scene. Relocalization with standard cameras has been studied extensively. However, standard cameras suffer from high power consumption, low frame rates, and poor robustness. Event cameras can compensate for these disadvantages. Unlike RGB data, event data is an asynchronous stream. Most event-processing methods convert event data into event images, but they cannot efficiently generate event images with clear edges at an arbitrary time. We propose a Reversed Window Entropy Image (RWEI) generation framework for events, which can generate event images with clear edges at any time. We also propose an Attention-guided Event Camera Relocalization Network (AECRN) that exploits the characteristics of event images to estimate the pose of the event camera more accurately. We evaluate the proposed framework and network on public dataset sequences, and experiments show that our method outperforms the previous method.
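To make the idea of converting an asynchronous event stream into an event image concrete, the following is a minimal generic sketch, not the RWEI algorithm itself, whose details are not given in this abstract. It assumes events are stored as rows of `(t, x, y, polarity)` sorted by time, takes a fixed number of events ending at a query time, accumulates them into a per-pixel count image, and computes the Shannon entropy of that image. The function names, the event layout, and the use of a fixed event count are all assumptions made for illustration.

```python
import numpy as np


def events_to_image(events, t_query, n_events, height, width):
    """Accumulate the last `n_events` events at or before `t_query` into a count image.

    `events` is assumed to be an (N, 4) array with columns (t, x, y, polarity),
    sorted by timestamp. Polarity is ignored in this simple count image.
    """
    timestamps = events[:, 0]
    # Index one past the last event whose timestamp is <= t_query.
    end = np.searchsorted(timestamps, t_query, side="right")
    # The window extends backward in time from the query time.
    start = max(0, end - n_events)
    window = events[start:end]

    image = np.zeros((height, width), dtype=np.float64)
    xs = window[:, 1].astype(int)
    ys = window[:, 2].astype(int)
    # np.add.at handles repeated pixel indices correctly (unbuffered add).
    np.add.at(image, (ys, xs), 1.0)
    return image


def shannon_entropy(image, bins=16):
    """Shannon entropy (in bits) of the image's intensity histogram."""
    hist, _ = np.histogram(image, bins=bins)
    probs = hist[hist > 0] / hist.sum()
    return float(-(probs * np.log2(probs)).sum())


if __name__ == "__main__":
    # Hypothetical toy stream: four events on a 4x4 sensor.
    events = np.array([
        [0.1, 1, 1, 1],
        [0.2, 1, 1, -1],
        [0.3, 2, 0, 1],
        [0.4, 3, 3, 1],
    ])
    image = events_to_image(events, t_query=0.35, n_events=2, height=4, width=4)
    print(image)
    print(shannon_entropy(image))
```

An entropy score of this kind could, for example, be used to compare candidate window sizes, though how the paper actually uses entropy is not stated here.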

Index Terms

1. Computing methodologies
  1. Computer graphics
    1. Graphics systems and interfaces
      1. Mixed / augmented reality

Published In

VRCAI '22: Proceedings of the 18th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and its Applications in Industry

December 2022

284 pages

ISBN:9798400700316

DOI:10.1145/3574131

Editors:
Enhua Wu (SKLCS, Chinese Academy of Sciences / FST, University of Macau / Guangzhou Greater Bay Area Virtual Reality Research Institute, China)
Lionel Ming-Shuan Ni (The Hong Kong University of Science and Technology (Guangzhou) & The Hong Kong University of Science and Technology, China)
Zhigeng Pan (Nanjing University of Information Science & Technology / Hangzhou Normal University, China)
Daniel Thalmann (École Polytechnique Fédérale de Lausanne (EPFL), Switzerland)
Ping Li (The Hong Kong Polytechnic University, Hong Kong, China)
Charlie C.L. Wang (The University of Manchester, U.K.)
Lei Zhu (The Hong Kong University of Science and Technology (Guangzhou) & The Hong Kong University of Science and Technology, China)
Minghao Yang (Institute of Automation, Chinese Academy of Sciences, China)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Natural Science Foundation of China

Conference

VRCAI '22

Sponsor:

SIGGRAPH

VRCAI '22: The 18th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and its Applications in Industry

December 27 - 29, 2022

Guangzhou, China

Acceptance Rates

Overall Acceptance Rate 51 of 107 submissions, 48%

Article Metrics

Total Citations: 0
Total Downloads: 421
Downloads (Last 12 months): 202
Downloads (Last 6 weeks): 37

Reflects downloads up to 18 Nov 2024
