research-article

Online deep Bingham network for probabilistic orientation estimation

Authors:

Lijun Chen

IET Computer Vision, Volume 17, Issue 6

Pages 663 - 675

https://doi.org/10.1049/cvi2.12188

Published: 14 March 2023

Abstract

Orientation estimation is a core problem in several computer vision tasks. Recently, deep learning techniques combined with the Bingham distribution have attracted considerable interest for this problem when ambiguities and rotational symmetries of objects must be considered. However, existing works suffer from two issues. First, the computational overhead of calculating the normalisation constant of the Bingham distribution is relatively high. Second, the choice of loss function remains uncertain. To address these problems, we present an online deep Bingham network to estimate the orientation of objects. We sharply reduce the computational overhead of the normalisation constant by directly applying a numerical integration formula. Additionally, we are the first to give theorems on the convexity and Lipschitz continuity of the Bingham distribution's negative log‐likelihood, which formally indicate that it is a proper choice of loss function. We test our method on three public datasets, namely UPNA, T‐LESS and Pascal3D+, showing that it outperforms the state of the art in orientation accuracy and time efficiency and reduces the runtime by more than 6 h compared with offline methods. The ablation experiments further demonstrate the effectiveness and robustness of our model.

Graphical Abstract

To address the computational overhead of the Bingham normalisation constant and the uncertain choice of loss function, we present an online deep Bingham network to estimate the orientation of objects and give theorems on the chosen loss function. The results show that our method outperforms the state of the art in orientation accuracy and time efficiency.
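
The two ingredients named above, the Bingham normalisation constant and the negative log‐likelihood loss, can be illustrated with a minimal Python sketch. It assumes the common parametrisation p(q) = exp(q^T M Z M^T q) / F(Z) for unit quaternions q, with an orthogonal matrix M and Z = diag(z1, z2, z3, z4). This page does not state the parametrisation or which numerical integration formula the paper applies, so the plain midpoint rule in hyperspherical coordinates used here is a stand‐in and not the authors' method. The names bingham_norm_const and bingham_nll and the grid size n are illustrative choices.

```python
import math


def bingham_norm_const(z, n=48):
    """Approximate F(Z), the integral over the unit 3-sphere of
    exp(sum_i z_i * x_i**2), with a midpoint rule (assumed stand-in)."""
    h_polar = math.pi / n        # step for the two polar angles in [0, pi]
    h_azim = 2.0 * math.pi / n   # step for the azimuth in [0, 2*pi)
    total = 0.0
    for i in range(n):
        p1 = (i + 0.5) * h_polar
        s1, c1 = math.sin(p1), math.cos(p1)
        for j in range(n):
            p2 = (j + 0.5) * h_polar
            s2, c2 = math.sin(p2), math.cos(p2)
            weight = s1 * s1 * s2  # surface element of S^3: sin^2(p1) * sin(p2)
            for k in range(n):
                p3 = (k + 0.5) * h_azim
                s3, c3 = math.sin(p3), math.cos(p3)
                # Point on the unit 3-sphere in hyperspherical coordinates.
                x = (c1, s1 * c2, s1 * s2 * c3, s1 * s2 * s3)
                total += weight * math.exp(
                    sum(zi * xi * xi for zi, xi in zip(z, x))
                )
    return total * h_polar * h_polar * h_azim


def bingham_nll(q, M, z, n=48):
    """Negative log-likelihood of a unit quaternion q under the assumed
    parametrisation: log F(Z) - q^T M Z M^T q."""
    # y = M^T q: coordinates of q along the columns of M.
    y = [sum(M[r][c] * q[r] for r in range(4)) for c in range(4)]
    quad = sum(zi * yi * yi for zi, yi in zip(z, y))
    return math.log(bingham_norm_const(z, n)) - quad


if __name__ == "__main__":
    identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    z = [-10.0, -5.0, -1.0, 0.0]   # hypothetical dispersion parameters
    q = [0.0, 0.0, 0.0, 1.0]       # mode of this example distribution
    print(bingham_nll(q, identity, z, n=24))
```

With Z = 0 the density is uniform and F equals the surface area of the unit 3‐sphere, 2π², which gives a quick sanity check for the quadrature. The loss is unchanged when q is replaced by −q, matching the antipodal symmetry that makes the Bingham distribution suitable for quaternions.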




Published In


IET Computer Vision Volume 17, Issue 6

September 2023

108 pages

EISSN: 1751-9640

DOI: 10.1049/cvi2.v17.6


© 2023 The Authors. IET Computer Vision published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology.

This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.

Publisher

John Wiley & Sons, Inc.

United States

