Abstract
Cost volume is an essential component of recent deep models for optical flow estimation and is usually constructed by calculating the inner product between two feature vectors. However, the standard inner product in the commonly-used cost volume may limit the representation capacity of flow models because it neglects the correlation among different channel dimensions and weighs each dimension equally. To address this issue, we propose a learnable cost volume (LCV) using an elliptical inner product, which generalizes the standard inner product by a positive definite kernel matrix. To guarantee its positive definiteness, we perform spectral decomposition on the kernel matrix and re-parameterize it via the Cayley representation. The proposed LCV is a lightweight module and can be easily plugged into existing models to replace the vanilla cost volume. Experimental results show that the LCV module not only improves the accuracy of state-of-the-art models on standard benchmarks, but also promotes their robustness against illumination change, noises, and adversarial perturbations of the input signals.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Absil, P.A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton (2009)
Bao, W., Lai, W.S., Ma, C., Zhang, X., Gao, Z., Yang, M.H.: Depth-aware video frame interpolation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Brox, T., Malik, J.: Large displacement optical flow: descriptor matching in variational motion estimation. IEEE Trans. Pattern Recogn. Mach. Intell. (PAMI) 33(3), 500–513 (2010)
Butler, D.J., Wulff, J., Stanley, G.B., Black, M.J.: A naturalistic open source movie for optical flow evaluation. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7577, pp. 611–625. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33783-3_44
Cayley, A.: About the algebraic structure of the orthogonal group and the other classical groups in a field of characteristic zero or a prime characteristic. Reine Angewandte Mathematik 32, 1846 (1846)
Cheng, J., Tsai, Y.H., Wang, S., Yang, M.H.: Segflow: joint learning for video object segmentation and optical flow. In: IEEE International Conference on Computer Vision (ICCV) (2017)
Dosovitskiy, A., et al.: Flownet: learning optical flow with convolutional networks. In: IEEE International Conference on Computer Vision (ICCV) (2015)
Dosovitskiy, A., et al.: Flownet: learning optical flow with convolutional networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Hafner, D., Demetz, O., Weickert, J.: Why is the census transform good for robust optic flow computation? In: Kuijper, A., Bredies, K., Pock, T., Bischof, H. (eds.) SSVM 2013. LNCS, vol. 7893, pp. 210–221. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38267-3_18
Horn, B.K., Schunck, B.G.: Determining optical flow. Artif. Intell. 17(1–3), 185–203 (1981)
Hui, T.W., Tang, X., Change Loy, C.: Liteflownet: a lightweight convolutional neural network for optical flow estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Hur, J., Roth, S.: Mirrorflow: exploiting symmetries in joint optical flow and occlusion estimation. IEEE International Conference on Computer Vision (ICCV), pp. 312–321 (2017)
Hur, J., Roth, S.: Iterative residual refinement for joint optical flow and occlusion estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., Brox, T.: Flownet 2.0: evolution of optical flow estimation with deep networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Janai, J., Güney, F., Ranjan, A., Black, M.J., Geiger, A.: Unsupervised learning of multi-frame optical flow with occlusions. In: European Conference on Computer Vision (ECCV) (2018)
Yu, J.J., Harley, A.W., Derpanis, K.G.: Back to basics: unsupervised learning of optical flow via brightness constancy and motion smoothness. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9915, pp. 3–10. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49409-8_1
Jolliffe, I.T.: Principal components in regression analysis. In: Principal Component Analysis, pp. 129–155. Springer, New York (1986). https://doi.org/10.1007/978-1-4757-1904-8_8
Kendall, A., et al.: End-to-end learning of geometry and context for deep stereo regression. In: IEEE International Conference on Computer Vision (ICCV) (2017)
Li, Y., Fang, C., Yang, J., Wang, Z., Lu, X., Yang, M.H.: Universal style transfer via feature transforms. In: Neural Information Processing Systems (NeurIPS) (2017)
Li, Y., Fang, C., Yang, J., Wang, Z., Lu, X., Yang, M.H.: Flow-grounded spatial-temporal video prediction from still images. In: European Conference on Computer Vision (ECCV) (2018)
Lin, J., Gan, C., Han, S.: Tsm: temporal shift module for efficient video understanding. In: IEEE International Conference on Computer Vision (ICCV) (2019)
Liu, P., King, I., Lyu, M.R., Xu, J.: Ddflow: learning optical flow with unlabeled data distillation. In: Association for the Advancement of Artificial Intelligence (AAAI) (2019)
Liu, P., Lyu, M.R., King, I., Xu, J.: Selflow: self-supervised learning of optical flow. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Meister, S., Hur, J., Roth, S.: Unflow: unsupervised learning of optical flow with a bidirectional census loss. In: Association for the Advancement of Artificial Intelligence (AAAI) (2017)
Menze, M., Heipke, C., Geiger, A.: Joint 3d estimation of vehicles and scene flow. In: ISPRS Workshop on Image Sequence Analysis (ISA) (2015)
Menze, M., Heipke, C., Geiger, A.: Object scene flow. ISPRS J. Photogrammetry Remote Sensing (JPRS) 140, 60–76 (2018)
Mayer, N., et al.: A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Ranjan, A., Black, M.J.: Optical flow estimation using a spatial pyramid network. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Ranjan, A., Janai, J., Geiger, A., Black, M.J.: Attacking optical flow. In: IEEE International Conference on Computer Vision (ICCV) (2019)
Ren, Z., Yan, J., Ni, B., Liu, B., Yang, X., Zha, H.: Unsupervised deep learning for optical flow estimation. In: Association for the Advancement of Artificial Intelligence (AAAI) (2017)
Scharstein, D., Szeliski, R.: A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Int. J. Comput. Vis. (IJCV) 47(1–3), 7–42 (2002). https://doi.org/10.1023/A:1014573219977
Sun, D., Yang, X., Liu, M.Y., Kautz, J.: Pwc-net: CNNs for optical flow using pyramid, warping, and cost volume. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Sun, D., Yang, X., Liu, M.Y., Kautz, J.: Models matter, so does training: an empirical study of CNNs for optical flow estimation. IEEE Trans. Pattern Recogn. Mach. Intell. (PAMI) 42, 1408–1423 (2019)
Teed, Z., Deng, J.: Raft: recurrent all-pairs field transforms for optical flow. arXiv preprint arXiv:2003.12039 (2020)
Tsai, Y.H., Yang, M.H., Black, M.J.: Video segmentation via object flow. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Wang, Y., Yang, Y., Yang, Z., Zhao, L., Xu, W.: Occlusion aware unsupervised learning of optical flow. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Xu, J., Ranftl, R., Koltun, V.: Accurate optical flow via direct cost volume processing. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Yang, G., Ramanan, D.: Volumetric correspondence networks for optical flow. In: Neural Information Processing Systems (NeurIPS) (2019)
Yin, Z., Darrell, T., Yu, F.: Hierarchical discrete distribution decomposition for match density estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019
Yin, Z., Shi, J.: Geonet: unsupervised learning of dense depth, optical flow and camera pose. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Zbontar, J., LeCun, Y.: Stereo matching by training a convolutional neural network to compare image patches. J. Mach. Learn. Res. (JMLR) 17, 2287–2318 (2016)
Zou, Y., Luo, Z., Huang, J.B.: Df-net: unsupervised joint learning of depth and flow using cross-task consistency. In: European Conference on Computer Vision (ECCV) (2018)
Acknowledgements
This work is supported in part by NSF CAREER Grant 1149783. We also thank Pengpeng Liu and Jingfeng Wu for kind help.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
1 Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary material 2 (mp4 17192 KB)
Supplementary material 3 (mp4 33238 KB)
Supplementary material 4 (mp4 27663 KB)
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Xiao, T. et al. (2020). Learnable Cost Volume Using the Cayley Representation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12354. Springer, Cham. https://doi.org/10.1007/978-3-030-58545-7_28
Download citation
DOI: https://doi.org/10.1007/978-3-030-58545-7_28
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58544-0
Online ISBN: 978-3-030-58545-7
eBook Packages: Computer ScienceComputer Science (R0)